Google Research

Auditing Gender Presentation Differences in Text-to-Image Models

  • Yanzhe Zhang
  • Lu Jiang
  • Greg Turk
  • Diyi Yang
(2023), to appear


Text-to-image models, which can generate high-quality images based on textual input, have recently enabled various content-creation tools. Despite significantly affecting a wide range of downstream applications, the distributions of these generated images have yet to be comprehensively understood, especially regarding the potential stereotypical attributes of different genders. In this work, we propose a paradigm that utilizes fine-grained self-presentation attributes to study how different genders are presented differently in text-to-image models, namely Gender Presentation Differences. By probing the gender indicators in the input text (e.g., "a woman" or "a man"), we quantify the frequency differences of human-centric attributes (e.g., "a shirt" and "a dress") through human annotation and introduce two novel metrics: the GEP (GEnder Presentation differences) vector and the GEP score. Furthermore, the proposed automatic estimation of the two metrics correlates better with human annotations than existing CLIP-based measures, consistently across three state-of-the-art text-to-image models. Finally, we demonstrate that our metrics can generalize to gender/racial stereotypes related to occupations.
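The metrics described above can be illustrated with a minimal sketch. This is not the paper's implementation: here we assume the GEP vector is simply the per-attribute frequency difference between images generated with two gender indicators, and the GEP score aggregates those differences into one scalar via the mean absolute difference; the paper's exact definitions may differ.

```python
import numpy as np

def gep_vector(freq_a, freq_b):
    """Per-attribute frequency difference between two gender groups.

    freq_a, freq_b: attribute frequencies (e.g., the fraction of generated
    images containing "a dress") for prompts using each gender indicator.
    This pairwise-difference form is an assumption for illustration.
    """
    return np.asarray(freq_a, dtype=float) - np.asarray(freq_b, dtype=float)

def gep_score(freq_a, freq_b):
    """Collapse the GEP vector into one scalar.

    Mean absolute difference is used here as a plausible aggregation;
    the paper may aggregate differently.
    """
    return float(np.mean(np.abs(gep_vector(freq_a, freq_b))))

# Toy frequencies for three attributes: ["a shirt", "a dress", "a tie"]
women = [0.40, 0.55, 0.02]
men = [0.80, 0.01, 0.30]

print(gep_vector(women, men))  # per-attribute differences
print(gep_score(women, men))   # single disparity score
```

A score near zero would indicate that the model presents both genders with similar attribute frequencies; larger values indicate stronger presentation differences.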
