Measuring and Reducing Gendered Correlations in Pre-trained Models

Kellie Webster

Xuezhi Wang

Ian Tenney

Alex Beutel

Emily Pitler

Ellie Pavlick

Jilin Chen

Ed H. Chi

Slav Petrov

arXiv (2020)

Abstract

Large pre-trained models have revolutionized natural language understanding. However, researchers have found they can encode correlations undesired in many applications, like "surgeon" being associated more with "he" than "she". We explore such gendered correlations as a case study, to learn how we can configure and train models to mitigate the risk of encoding unintended associations. We find that it is important to define correlation metrics, since they can reveal differences among models with similar accuracy. Large models have more capacity to encode gendered correlations, but this can be mitigated with general dropout regularization. Counterfactual data augmentation is also effective, and can even reduce correlations not explicitly targeted for mitigation, potentially making it useful beyond gender too. Both techniques yield models with comparable accuracy to unmitigated analogues, and still resist re-learning correlations in fine-tuning.
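
To make the idea of counterfactual data augmentation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the word-pair list `GENDER_PAIRS` and the helper names `counterfactual` and `augment` are hypothetical, and the list is kept deliberately tiny. It leaves out ambiguous words such as "her", which can map to either "his" or "him" and would need part-of-speech information to swap correctly.

```python
import re

# Hypothetical, deliberately small word-pair list (an assumption for
# illustration; it is not the list used in the paper).
GENDER_PAIRS = [
    ("he", "she"),
    ("man", "woman"),
    ("father", "mother"),
    ("son", "daughter"),
]

# Build a two-way lookup so each word maps to its counterpart.
SWAP = {}
for a, b in GENDER_PAIRS:
    SWAP[a] = b
    SWAP[b] = a

# Match any listed word as a whole word, ignoring case.
TOKEN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in SWAP) + r")\b",
    re.IGNORECASE,
)


def counterfactual(sentence):
    """Return the sentence with every listed gendered word swapped."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Preserve an initial capital letter if the original had one.
        return swapped.capitalize() if word[0].isupper() else swapped

    return TOKEN.sub(repl, sentence)


def augment(corpus):
    """Keep each sentence and add its counterfactual when it differs."""
    augmented = []
    for sentence in corpus:
        augmented.append(sentence)
        swapped = counterfactual(sentence)
        if swapped != sentence:
            augmented.append(swapped)
    return augmented
```

For example, `augment(["The surgeon said he was ready."])` keeps the original sentence and adds a copy in which "he" is replaced by "she", so both pronouns appear in the same context in the training data.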

Research Areas

Natural Language Processing
