On Measurements of Bias and Fairness in NLP

Emily Sheng
Jieyu Zhao
Aubrie Amstutz
Jiao Sun
Yu Hou
Mattie Sanseverino
Jiin Kim
Akihiro Nishi
Nanyun Peng
Kai-Wei Chang
AACL (2022)

Abstract

Recent studies show that Natural Language Processing (NLP) models propagate societal biases about protected attributes such as gender, race, and nationality.
While existing works propose bias evaluation and mitigation methods for various tasks, there remains a need to cohesively understand which biases and normative harms these measures capture and how different measures compare with one another. To address this gap, this work presents a comprehensive survey of existing bias measures in NLP, covering both intrinsic measures of representations and extrinsic measures of downstream applications, and organizes them by associated NLP tasks, metrics, datasets, societal biases, and corresponding harms. This survey also groups commonly used NLP fairness metrics into categories, discussing their advantages and disadvantages as well as how they relate to general fairness metrics common in machine learning.