Toward Deconfounding the Influence of Entity Demographics for Question Answering Accuracy

Maharshi Gor
Jordan Boyd-Graber
EMNLP (2021)

Abstract

Question Answering (QA) tasks are used as benchmarks of general machine intelligence. Therefore, robust QA evaluation is critical, and metrics should indicate how models will answer _any_ question. However, major QA datasets have skewed distributions over gender, profession, and nationality. Despite that skew, models generalize: we find little evidence that accuracy is lower for people based on gender or nationality. Instead, accuracy varies more with question topic and question ambiguity. Adequately assessing the generalization of QA systems requires more representative datasets.
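
To make the kind of stratified comparison described above concrete, here is a minimal Python sketch, not the paper's actual analysis pipeline, that computes QA accuracy grouped by a demographic attribute of the answer entity versus by question topic. The field names (`gender`, `topic`, `correct`) are hypothetical placeholders for whatever annotations a dataset provides.

```python
from collections import defaultdict


def accuracy_by_group(examples, group_key):
    """Compute accuracy stratified by one attribute (e.g. gender, nationality, topic).

    Each example is assumed to be a dict with the grouping attribute
    and a boolean `correct` flag for the system's prediction.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for ex in examples:
        group = ex.get(group_key, "unknown")
        totals[group] += 1
        hits[group] += int(ex["correct"])
    return {g: hits[g] / totals[g] for g in totals}


# Toy, illustrative data only.
examples = [
    {"gender": "female", "topic": "science", "correct": True},
    {"gender": "male", "topic": "sports", "correct": False},
    {"gender": "female", "topic": "sports", "correct": True},
]

print(accuracy_by_group(examples, "gender"))  # accuracy per entity gender
print(accuracy_by_group(examples, "topic"))   # accuracy per question topic
```

Comparing the spread of per-group accuracies across these two groupings is one simple way to ask whether accuracy differences track demographics or question topic.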