Vinodkumar Prabhakaran

Vinodkumar Prabhakaran is a Research Scientist at Google, working on issues around Ethical AI and ML Fairness. Prior to this, he was a postdoctoral researcher in the Computer Science department at Stanford University, where he worked with Prof. Dan Jurafsky and others at the Stanford NLP group, in an array of projects with a focus on applying Artificial Intelligence for Social Good. He obtained his PhD in computer science from Columbia University in 2015. His research brings together natural language processing techniques, machine learning algorithms, and social science methods to build scalable ways to identify and address large-scale societal issues such as racial disparities in policing, workplace incivility, gender bias and stereotypes, and abusive behavior online.
Authored Publications
    Preview abstract Chatbots based on large language models (LLMs) exhibit a level of human-like behavior that promises to have profound impacts on how people access information, create content, and seek social support. Yet these models have also shown a propensity toward biases and hallucinations, i.e., making up entirely false information and conveying it as truthful. Consequently, understanding and moderating safety risks in these models is a critical technical and social challenge. We use Bayesian multilevel models to explore the connection between rater demographics and their perception of safety in chatbot dialogues. We study a sample of 252 human raters stratified by gender, age, race/ethnicity, and location. Raters were asked to annotate the safety risks of 1,340 chatbot conversations. We show that raters from certain demographic groups are more likely to report safety risks than raters from other groups. We discuss the implications of these differences in safety perception and suggest measures to ameliorate these differences. View details
    MD3: The Multi-Dialect Dataset of Dialogues
    Clara Rivera
    Dora Demszky
    Devyani Sharma
    InterSpeech (2023) (to appear)
    Preview abstract We introduce a new dataset of conversational speech representing English from India, Nigeria, and the United States. Unlike prior datasets, the Multi-Dialect Dataset of Dialogues (MD3) strikes a balance between open-ended conversational speech and task-oriented dialogue by prompting participants to perform a series of short information-sharing tasks. This facilitates quantitative cross-dialectal comparison, while avoiding the imposition of a restrictive task structure that might inhibit the expression of dialect features. Preliminary analysis of the dataset reveals significant differences in syntax and in the use of discourse markers. The dataset includes more than 20 hours of audio and more than 200,000 orthographically-transcribed tokens, and is made publicly available at https://www.kaggle.com/datasets/jacobeis99/md3en. View details
    Preview abstract Machine learning approaches often require training and evaluation datasets with a clear separation between positive and negative examples. This risks simplifying and even obscuring the inherent subjectivity present in many tasks. Preserving such variance in content and diversity in datasets is often expensive and laborious. This is especially troubling when building safety datasets for conversational AI systems, as safety is both socially and culturally situated. To demonstrate this crucial aspect of conversational AI safety, and to facilitate in-depth model performance analyses, we introduce the DICES (Diversity In Conversational AI Evaluation for Safety) dataset that contains fine-grained demographic information about raters, high replication of ratings per item to ensure statistical power for analyses, and encodes rater votes as distributions across different demographics to allow for in-depth explorations of different aggregation strategies. In short, the DICES dataset enables the observation and measurement of variance, ambiguity, and diversity in the context of conversational AI safety. We also illustrate how the dataset offers a basis for establishing metrics to show how raters’ ratings can intersect with demographic categories such as racial/ethnic groups, age groups, and genders. The goal of DICES is to be used as a shared resource and benchmark that respects diverse perspectives during safety evaluation of conversational AI systems. View details
    Preview abstract Along with the recent advances in large language modeling, there is growing concern that language technologies may reflect, propagate, and amplify various social stereotypes about groups of people. Publicly available stereotype benchmarks play a crucial role in detecting and mitigating this issue in language technologies to prevent both representational and allocational harms in downstream applications. However, existing stereotype benchmarks are limited in their size and coverage, largely restricted to stereotypes prevalent in Western society. This is especially problematic as language technologies are gaining hold across the globe. To address this gap, we present SeeGULL, a broad-coverage stereotype dataset, expanding the coverage by utilizing the generative capabilities of large language models such as PaLM and GPT-3, and leveraging a globally diverse rater pool to validate prevalence of those stereotypes in society. SeeGULL is an order of magnitude larger in terms of size, and contains stereotypes for 179 identity groups spanning 6 continents, 8 different regions, 178 countries, 50 US states, and 31 Indian states and union territories. We also get fine-grained offensiveness scores for different stereotypes and demonstrate how stereotype perceptions for the same identity group differ across in-region vs out-region annotators. View details
    Preview abstract Measurements of fairness in NLP have been critiqued for lacking concrete definitions of biases or harms measured, and for perpetuating a singular, Western narrative of fairness globally. To combat some of these pivotal issues, methods for curating datasets and benchmarks that target specific harms are rapidly emerging. However, these methods still face the significant challenge of achieving coverage over global cultures and perspectives at scale. To address this, in this paper, we highlight the utility and importance of complementary approaches in these curation strategies, which leverage both community engagement and large generative models. We specifically target the harm of stereotyping and demonstrate a pathway to build a benchmark that covers stereotypes about diverse and intersectional identities. View details
    Preview abstract Dialogue safety as a task is complex, in part because ‘safety’ entails a broad range of topics and concerns, such as toxicity, harm, legal concerns, health advice, etc. Who we ask to judge safety and who we ask to define safety may lead to differing conclusions. This is because definitions and understandings of safety can vary according to one’s identity, public opinion, and the interpretation of existing laws and regulations. In this study, we compare annotations from a diverse set of over 100 crowd raters to gold labels derived from trust and safety (T&S) experts in a dialogue safety task consisting of 350 human-chatbot conversations. We find patterns of disagreements rooted in dialogue structure, dialogue content, and rating rationale. In contrast to typical approaches which treat gold labels as ground truth, we propose alternative ways of interpreting gold data and incorporating crowd disagreement rather than mitigating it. We discuss the complexity of safety annotation as a task, what crowd and T&S labels each uniquely capture, and how to make determinations about when and how to rely on crowd or T&S labels. View details
    Preview abstract Human annotated data plays a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into dataset annotation have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms, and what that relationship affords them. Finally, we introduce a novel framework, CrowdWorkSheets, for dataset developers to facilitate transparent documentation of key decision points at various stages of the data annotation pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset release and maintenance. View details
    Preview abstract Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this position paper, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in capability & resources, and adapting to Indian cultural values. We also report high-level findings from an empirical study on various social stereotypes for Region and Religion axes in the Indian context, demonstrating their prevalence in corpora and models. View details
    The Reasonable Effectiveness of Diverse Evaluation Data
    Christopher Homan
    Alex Taylor
    Human Evaluation for Generative Models (HEGM) Workshop at NeurIPS 2022
    Preview abstract In this paper, we present findings from a semi-experimental exploration of rater diversity and its influence on safety annotations of conversations generated by humans talking to a generative AI chatbot. We find significant differences in judgments produced by raters from different geographic regions and annotation platforms, and correlate these perspectives with demographic sub-groups. Our work helps define best practices in model development, specifically human evaluation of generative models, against the backdrop of growing work on sociotechnical AI evaluations. View details
    Re-contextualizing Fairness in NLP: The Case of India
    Shaily Bhatt
    In Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (AACL-IJCNLP) (2022)
    Preview abstract Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this paper, we focus on NLP fairness in the context of India. We start with a brief account of the prominent axes of social disparities in India. We build resources for fairness evaluation in the Indian context and use them to demonstrate prediction biases along some of the axes. We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models. Finally, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in NLP capabilities and resources, and adapting to Indian cultural values. While we focus on India, this framework can be generalized to other geo-cultural contexts. View details
    Preview abstract Testing, within the machine learning (ML) community, has been predominantly about assessing a learned model's predictive performance measured against a test dataset. This test dataset is often a held-out subset of the dataset used to train the model, and hence expected to follow the same data distribution as the training dataset. While recent work on robustness testing within ML has pointed to the importance of testing against distributional shifts, these efforts also focus on estimating the likelihood of the model making an error against a reference dataset/distribution. In this paper, we argue that this view of testing actively discourages researchers and developers from looking into many other sources of robustness failures, for instance corner cases. We draw parallels with decades of work within software engineering testing focused on assessing a software system against various stress conditions, including corner cases, as opposed to solely focusing on average-case behaviour. Finally, we put forth a set of recommendations to broaden the view of machine learning testing to a rigorous practice. View details
    PaLM: Scaling Language Modeling with Pathways
    Sharan Narang
    Jacob Devlin
    Maarten Bosma
    Hyung Won Chung
    Sebastian Gehrmann
    Parker Schuh
    Sasha Tsvyashchenko
    Abhishek Rao
    Yi Tay
    Noam Shazeer
    Nan Du
    Reiner Pope
    James Bradbury
    Guy Gur-Ari
    Toju Duke
    Henryk Michalewski
    Xavier Garcia
    Liam Fedus
    David Luan
    Barret Zoph
    Ryan Sepassi
    David Dohan
    Shivani Agrawal
    Mark Omernick
    Marie Pellat
    Aitor Lewkowycz
    Erica Moreira
    Rewon Child
    Oleksandr Polozov
    Zongwei Zhou
    Michele Catasta
    Jason Wei
    arXiv:2204.02311 (2022)
    Preview abstract Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis on bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and discuss potential mitigation strategies. View details
    Preview abstract In order to build trust that a machine learned model is appropriate and responsible within a systems context involving technical and human components, a broad range of factors typically need to be considered. However, in practice, model evaluations frequently focus on only a narrow range of expected predictive behaviours. This paper examines the critical evaluation gap between the idealized breadth of concerns and the observed narrow focus of actual evaluations. In doing so, we demonstrate which values are centered, and which are marginalized, within the machine learning community. Through an empirical study of machine learning papers from recent high profile conferences, we demonstrate the discipline’s general focus on a small set of evaluation methods. By considering the mathematical formulations of evaluation metrics and the test datasets over which they are calculated, we draw attention to which properties of models are centered in the field. This analysis also reveals an important gap: the properties of models which are frequently neglected or sidelined during evaluation. By studying the structure of this gap, we demonstrate the machine learning discipline’s implicit assumption of a range of commitments which have normative impacts; these include commitments to consequentialism, abstractability from context, the quantifiability of impacts, the irrelevance of non-predictive features, and the equivalence of different failure modes. Shedding light on these assumptions and commitments enables us to question their appropriateness for different ML system contexts, and points the way towards more diverse and contextualized evaluation methodologies which can be used to more robustly examine the trustworthiness of ML models. View details
    Preview abstract Questions regarding implicitness, ambiguity and underspecification are crucial for multimodal image+text systems, but have received little attention to date. This paper maps out a conceptual framework to address this gap for systems which generate images from text inputs, specifically for systems which generate images depicting scenes from descriptions of those scenes. In doing so, we account for how texts and images convey different forms of meaning. We then outline a set of core challenges concerning textual and visual ambiguity and specificity tasks, as well as risks that may arise from improper handling of ambiguous and underspecified elements. We propose and discuss two strategies for addressing these challenges: a) generating a visually ambiguous output image, and b) generating a set of diverse output images. View details
    LaMDA: Language Models for Dialog Applications
    Aaron Daniel Cohen
    Alena Butryna
    Alicia Jin
    Apoorv Kulshreshtha
    Ben Zevenbergen
    Chung-ching Chang
    Cosmo Du
    Daniel De Freitas Adiwardana
    Dehao Chen
    Dmitry (Dima) Lepikhin
    Erin Hoffman-John
    Igor Krivokon
    James Qin
    Jamie Hall
    Joe Fenton
    Johnny Soraker
    Maarten Paul Bosma
    Marc Joseph Pickett
    Marcelo Amorim Menegali
    Marian Croak
    Maxim Krikun
    Noam Shazeer
    Rachel Bernstein
    Ravi Rajakumar
    Ray Kurzweil
    Romal Thoppilan
    Steven Zheng
    Taylor Bos
    Toju Duke
    Tulsee Doshi
    Vincent Y. Zhao
    Will Rusch
    Yuanzhong Xu
    arXiv (2022)
    Preview abstract We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows fewer improvements on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model’s responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency. View details
    Frameworks and Challenges to Participatory AI
    Abeba Birhane
    William Samuel Isaac
    Madeleine Clare Elish
    Iason Gabriel
    Shakir Mohamed
    In Proceedings of the Second Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO '22), ACM (2022)
    Preview abstract Participatory approaches to artificial intelligence (AI) and machine learning (ML) are gaining momentum: the increased attention comes partly with the view that participation opens the gateway to an inclusive, equitable, robust, responsible and trustworthy AI. Among other benefits, participatory approaches are essential to understanding and adequately representing the needs, desires and perspectives of historically marginalized communities. However, there currently exists a lack of clarity on what meaningful participation entails and what it is expected to do. In this paper we first review participatory approaches as situated in historical contexts as well as participatory methods and practices within the AI and ML pipeline. We then introduce three case studies in participatory AI. Participation holds the potential for beneficial, emancipatory and empowering technology design, development and deployment while also being at risk for concerns such as cooptation and conflation with other activities. We lay out these limitations and concerns and argue that as participatory AI/ML comes into vogue, a contextual and nuanced understanding of the term as well as consideration of who the primary beneficiaries of participatory activities ought to be constitute crucial factors to realizing the benefits and opportunities that participation brings. View details
    A Human Rights Approach to Responsible AI
    Iason Gabriel
    Margaret Mitchell
    Timnit Gebru
    EAAMO 2022 (non-archival poster), ACM (2022)
    Preview abstract Research on fairness, accountability, transparency and ethics of AI-based interventions in society has gained much-needed momentum in recent years. However, it lacks an explicit alignment with a set of normative values and principles that guide this research and interventions. Rather, an implicit consensus is often assumed to hold for the values we impart into our models, something that is at odds with the pluralistic world we live in. In this paper, we put forth the doctrine of universal human rights as a globally salient and cross-culturally recognized set of values that can serve as a grounding framework for explicit value alignment in responsible AI, and discuss its efficacy as a framework for civil society partnership and participation. We argue that a human rights framework orients the research in this space away from the machines and the risks of their biases, and towards humans and the risks to their rights, essentially helping to center the conversation around who is harmed, what harms they face, and how those harms may be mitigated. View details
    Preview abstract Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets, essentially shaping the research trajectories within our field, have not gotten nearly enough attention. In this paper, we survey an array of literature on human computation, with a focus on ethical considerations around crowdsourcing. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms and what that relationship affords them. Finally, we put forth a concrete set of recommendations and considerations for dataset developers at various stages of the ML data pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset documentation and release. View details
    Preview abstract Conventional algorithmic fairness is West-centric, as seen in its sub-groups, values, and optimisations. In this paper, we de-center algorithmic fairness and analyse AI power in India. Based on 36 qualitative interviews and a discourse analysis of algorithmic deployments in India, we find that several assumptions of algorithmic fairness are challenged in India. We find that data is not always reliable due to socio-economic factors, users are given third world treatment by ML makers, and AI signifies unquestioning aspiration. We contend that localising model fairness alone can be window dressing in India, where the distance between models and oppressed communities is large. Instead, we re-imagine algorithmic fairness in India and provide a roadmap to re-contextualise data and models, empower oppressed communities, and enable Fair-ML ecosystems. View details
    Preview abstract Building equitable and inclusive technologies demands paying attention to how social attitudes towards persons with disabilities are represented within technology. Representations perpetuated by NLP models often inadvertently encode undesirable social biases from the data on which they are trained. In this paper, first we present evidence of such undesirable biases towards mentions of disability in two different NLP models: toxicity prediction and sentiment analysis. Next, we demonstrate that neural embeddings that are critical first steps in most NLP pipelines also contain undesirable biases towards mentions of disabilities. We then expose the topical biases in the social discourse about some disabilities which may explain such biases in the models; for instance, terms related to gun violence, homelessness, and drug addiction are over-represented in discussions about mental illness. View details
    Participatory Problem Formulation for Fairer Machine Learning Through Community Based System Dynamics
    Jill Kuhlberg
    William Samuel Isaac
    Machine Learning in Real Life (ML-IRL) ICLR 2020 Workshop (2020), pp. 6
    Preview abstract Recent research on algorithmic fairness has highlighted that the problem formulation phase of ML system development can be a key source of bias that has significant downstream impacts on ML system fairness outcomes. However, very little attention has been paid to methods for improving the fairness efficacy of this critical phase of ML system development. Current practice neither accounts for the dynamic complexity of high-stakes domains nor incorporates the perspectives of vulnerable stakeholders. In this paper we introduce community based system dynamics (CBSD) as an approach to enable the participation of typically excluded stakeholders in the problem formulation phase of the ML system development process and facilitate the deep problem understanding required to mitigate bias during this crucial stage. View details
    Preview abstract Machine learning (ML) fairness research tends to focus primarily on mathematically-based interventions on often opaque algorithms or models and/or their immediate inputs and outputs. Recent research has pointed out the limitations of fairness approaches that rely on oversimplified mathematical models that abstract away the underlying societal context where models are ultimately deployed and from which model inputs and complex socially constructed concepts such as fairness originate. In this paper, we outline three new tools to improve the comprehension, identification and representation of societal context. First, we propose a complex adaptive systems (CAS) based model and definition of societal context that may help researchers and product developers expand the abstraction boundary of ML fairness work to include societal context. Second, we introduce collaborative causal theory formation (CCTF) as a key capability for establishing a socio-technical frame that incorporates diverse mental models and associated causal theories in modeling the problem and solution space for ML-based products. Finally, we identify system dynamics (SD) as an established, transparent and rigorous framework for practicing CCTF during all phases of the ML product development process. We conclude with a discussion of how these systems-based approaches to understanding the societal context within which socio-technical systems are embedded can improve the development of fair and inclusive ML-based products. View details
    Preview abstract Data-driven statistical Natural Language Processing (NLP) techniques leverage large amounts of language data to build models that can understand language. However, most language data reflect the public discourse at the time the data was produced, and hence NLP models are susceptible to learning incidental associations around named referents at a particular point in time, in addition to general linguistic meaning. An NLP system designed to model notions such as sentiment and toxicity should ideally produce scores that are independent of the identity of such entities mentioned in text and their social associations. For example, in a general purpose sentiment analysis system, a phrase such as "I hate Katy Perry" should be interpreted as having the same sentiment as "I hate Taylor Swift". Based on this idea, we propose a generic evaluation framework, Perturbation Sensitivity Analysis, which detects unintended model biases related to named entities, and requires no new annotations or corpora. We demonstrate the utility of this analysis by employing it on two different NLP models (a sentiment model and a toxicity model) applied to online comments in English from four different genres. View details
    Preview abstract Persons with disabilities face many barriers to participation in society, and the rapid advancement of technology creates ever more. Achieving fair opportunity and justice for people with disabilities demands paying attention not just to accessibility, but also to the attitudes towards, and representations of, disability that are implicit in machine learning (ML) models that are pervasive in how one engages with society. However, such models often inadvertently learn to perpetuate undesirable social biases from the data on which they are trained. This can result, for example, in models for classifying text producing very different predictions for "I stand by a person with mental illness" and "I stand by a tall person". We present evidence of such social biases in existing ML models, along with an analysis of biases in a dataset used for model development. View details