Remi Denton

Authored Publications
    Recent studies have highlighted the issue of varying degrees of stereotypical depictions of different identity groups. However, these existing approaches have several key limitations, including limited coverage of identity groups in their evaluations and of the range of stereotypes associated with them. Additionally, these studies often lack a critical distinction between inherently visual stereotypes, such as `brown' or `sombrero', and culturally influenced stereotypes like `kind' or `intelligent'. In this work, we address these limitations by grounding our evaluation of regional, geo-cultural stereotypes in images generated by text-to-image models, leveraging existing textual resources. We employ existing stereotype benchmarks and focus exclusively on identifying visual stereotypes in generated images spanning 135 identity groups. We also compute offensiveness across identity groups and assess the feasibility of identifying stereotypes automatically. Further, through a detailed case study and quantitative analysis, we reveal how the default representations of all identity groups have a more stereotypical appearance, and how, for historically marginalized groups, images across different attributes are visually more similar than for other groups, even when explicitly prompted otherwise.
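    As a rough illustration of the kind of evaluation described above, the sketch below scores how often generated images for an identity group contain attributes from a stereotype lexicon. Everything here is a hypothetical stand-in: in practice the lexicon would be drawn from an existing textual stereotype resource, and the per-image attribute tags would come from a captioning or attribute-detection model, neither of which is shown.

        # Minimal sketch of lexicon-based visual-stereotype scoring.
        # Both inputs are hypothetical stand-ins for real resources.
        stereotype_lexicon = {
            "group_a": {"sombrero", "poncho", "desert"},
            "group_b": {"sari", "rickshaw", "spices"},
        }

        # Attribute tags extracted from each generated image for a group,
        # e.g., by a captioning model (not shown here).
        image_attributes = {
            "group_a": [{"sombrero", "street"}, {"desert", "guitar"}],
            "group_b": [{"sari", "market"}, {"office", "laptop"}],
        }

        def stereotype_rate(group: str) -> float:
            """Fraction of a group's images containing at least one
            visual stereotype attribute from the lexicon."""
            lexicon = stereotype_lexicon[group]
            images = image_attributes[group]
            return sum(1 for attrs in images if attrs & lexicon) / len(images)

        for group in stereotype_lexicon:
            print(group, round(stereotype_rate(group), 2))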
    This paper reports on disability representation in images output by text-to-image (T2I) generative AI systems. Through eight focus groups with 25 people with disabilities, we found that models repeatedly presented reductive archetypes for different disabilities. These representations often reflected broader societal stereotypes and biases, which our participants were concerned to see reproduced through T2I. Our participants discussed further challenges with using these models, including the current reliance on prompt engineering to reach satisfactorily diverse results. Finally, they offered suggestions for improving disability representation, such as showing multiple, heterogeneous images for a single prompt and including the prompt alongside generated images. Our discussion reflects on the tensions and tradeoffs we found among the diverse perspectives shared, to inform future research on representation-oriented evaluation metrics and development processes for generative AI systems.
    Towards Globally Responsible Generative AI Benchmarks
    Rida Qadri
    ICLR Workshop: Practical ML for Developing Countries Workshop (2023)
    As generative AI globalizes, there is an opportunity to reorient our nascent development frameworks and evaluative practices towards a global context. This paper uses lessons from a community-centered study on the failure modes of text-to-image models in the South Asian context to offer suggestions on how the AI/ML community can develop culturally and contextually situated benchmarks. We present three forms of mitigation for culturally situated evaluations: 1) diversifying our diversity measures, 2) participatory prompt dataset curation, and 3) multi-tiered evaluation structures for community engagement. Through these mitigations we present concrete methods to make our evaluation processes more holistic and human-centered while also engaging with the demands of deployment at global scale.
    AI’s Regimes of Representation: A Community-centered Study of Text-to-Image Models in South Asia
    Rida Qadri
    Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, Association for Computing Machinery, 506–517
    This paper presents a community-centered study of the cultural limitations of text-to-image (T2I) models in the South Asian context. We theorize these failures using scholarship on dominant media regimes of representation and locate them within participants' reporting of their existing social marginalizations. We thus show how generative AI can reproduce an outsider's gaze for viewing South Asian cultures, shaped by global and regional power inequities. By centering communities as experts and soliciting their perspectives on T2I limitations, our study adds rich nuance to existing evaluative frameworks and deepens our understanding of the culturally specific ways AI technologies can fail in non-Western and Global South settings. We distill lessons for the responsible development of T2I models, recommending concrete pathways forward that allow for recognition of structural inequalities.
    Large language models (LLMs) trained on real-world data can inadvertently reflect harmful societal biases, particularly toward historically marginalized communities. While previous work has primarily focused on harms related to age and race, emerging research has shown that biases toward disabled communities exist. This study extends prior work exploring the existence of harms by identifying categories of LLM-perpetuated harms toward the disability community. We conducted 19 focus groups, during which 56 participants with disabilities probed a dialog model about disability and discussed and annotated its responses. Participants rarely characterized model outputs as blatantly offensive or toxic. Instead, participants used nuanced language to detail how the dialog model mirrored subtle yet harmful stereotypes they encountered in their lives and dominant media, e.g., inspiration porn and able-bodied saviors. Participants often implicated training data as a cause for these stereotypes and recommended training the model on diverse identities from disability-positive resources. Our discussion further explores representative data strategies to mitigate harm related to different communities through annotation co-design with ML researchers and developers.
    Human-annotated data plays a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into dataset annotation have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms, and what that relationship affords them. Finally, we introduce a novel framework, CrowdWorkSheets, for dataset developers to facilitate transparent documentation of key decision points at various stages of the data annotation pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset release and maintenance.
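    To make the framework's stages concrete, here is a minimal sketch of how a CrowdWorkSheets-style record might be captured in code. The five fields mirror the pipeline stages named in the abstract; the schema and example values are illustrative assumptions, not the framework's published template.

        # Minimal sketch of a CrowdWorkSheets-style documentation record.
        # The schema is an illustrative assumption; only the stage names
        # come from the paper.
        from dataclasses import dataclass

        @dataclass
        class CrowdWorkSheet:
            task_formulation: str             # how the annotation task was framed
            annotator_selection: str          # who annotated, and how they were recruited
            platform_and_infrastructure: str  # platform, pay, and tooling choices
            analysis_and_evaluation: str      # quality checks, disagreement handling
            release_and_maintenance: str      # versioning, access, update policy

        sheet = CrowdWorkSheet(
            task_formulation="Rate dialog responses for subtle stereotypes.",
            annotator_selection="Annotators recruited from affected communities.",
            platform_and_infrastructure="Crowd platform; hourly pay disclosed.",
            analysis_and_evaluation="Per-annotator labels kept, not majority-voted.",
            release_and_maintenance="Versioned release with documented known gaps.",
        )
        print(sheet.task_formulation)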
    Do Datasets Have Politics? Disciplinary Values in Computer Vision Dataset Development
    Morgan Klaus Scheuerman
    Alex Hanna
    The 24th ACM Conference on Computer-Supported Cooperative Work and Social Computing (2021)
    Data is a crucial component of machine learning; a model relies on data to train, validate, and test it. With increased technical capabilities, machine learning research has boomed in both academic and industry settings, and one major focus has been on computer vision. Computer vision is a popular domain of machine learning increasingly pertinent to real-world applications, from facial recognition in policing to object detection for autonomous vehicles. Given computer vision's propensity to shape machine learning research practices and impact human life, we sought to understand disciplinary practices around dataset documentation: how data is collected, curated, annotated, and packaged into datasets for computer vision researchers and practitioners to use for model tuning and development. Specifically, we examined what dataset documentation communicates about the underlying values of vision data and the larger practices and goals of computer vision as a field. To conduct this study, we collected a large corpus of computer vision datasets, from which we sampled 114 databases across different vision tasks. We document a number of values around accepted data practices, what makes desirable data, and the treatment of humans in the dataset construction process. We discuss how computer vision database authors value efficiency at the expense of care; universality at the expense of contextuality; impartiality at the expense of positionality; and model work at the expense of data work. Many of the silenced values we identified sit in opposition to human-centered data practices, which we reference in our suggestions for better incorporating silenced values into the dataset curation process.
    In response to growing concerns of bias, discrimination, and unfairness perpetuated by algorithmic systems, the datasets used to train and evaluate machine learning models have come under increased scrutiny. Many of these examinations have focused on the contents of machine learning datasets, finding glaring underrepresentation of minoritized groups. In contrast, relatively little work has been done to examine the norms, values, and assumptions embedded in these datasets. In this work, we conceptualize machine learning datasets as a type of informational infrastructure, and motivate a genealogy as method in examining the histories and modes of constitution at play in their creation. We present a critical history of ImageNet as an exemplar, utilizing critical discourse analysis of major texts around ImageNet's creation and impact. We find that assumptions around ImageNet and other large computer vision datasets more generally rely on three themes: the aggregation and accumulation of more data, the computational construction of meaning, and making certain types of data labor invisible. By tracing the discourses that surround this influential benchmark, we contribute to the ongoing development of the standards and norms around data development in machine learning and artificial intelligence research.
    Art Sheets for Art Datasets
    Ramya Malur Srinivasan
    Jordan Jennifer Famularo
    Beth Coleman
    NeurIPS Datasets and Benchmarks Track (2021)
    As machine learning (ML) techniques are being employed to authenticate artworks and estimate their market value, computational tasks have expanded across a variety of creative domains and datasets drawn from the arts. With recent progress in generative modeling, ML techniques are also used for simulating artistic styles and for producing new content in various media such as music, visual arts, and poetry. While this progress has opened up new creative avenues, it has also paved the way for adverse downstream effects such as cultural appropriation (e.g., cultural misrepresentation, offense, and undervaluing) and amplification of gender and racial stereotypes, to name a few. Many such concerning issues stem from the training data in ways that diligent evaluation can uncover, prevent, and mitigate. In this paper, we provide a checklist of questions customized for use with art datasets, building on the questionnaire for datasets provided in Datasheets, guiding assessment of developer motivation together with dataset provenance, composition, collection, pre-processing, cleaning, labeling, use (including data generation/synthesis), distribution, and maintenance. Case studies exemplify the value of our questionnaire. We hope our work aids ML scientists and developers by providing a framework for responsible design, development, and use of art datasets.
    Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets, essentially shaping the research trajectories within our field, have not received nearly enough attention. In this paper, we survey an array of literature on human computation, with a focus on ethical considerations around crowdsourcing. We synthesize these insights and lay out the challenges in this space along two layers: (1) who the annotator is and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms and what that relationship affords them. Finally, we put forth a concrete set of recommendations and considerations for dataset developers at various stages of the ML data pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset documentation and release.