Are ever larger octopi still influenced by reporting biases?

Fangyu Liu

Jeremy Cole

Julian Martin Eisenschlos

Nigel Collier

AACL (2022)

Abstract

Language models (LMs) trained on raw text have no access to the physical world. Gordon and Van Durme (2013) argue that they therefore suffer from reporting bias: texts rarely state commonsensical facts about the world and more frequently describe non-commonsensical facts and events. If LMs naively overfit to the co-occurrence statistics of their training corpora, they learn a biased view of the physical world. While prior studies have repeatedly verified that LMs of smaller scales (e.g. RoBERTa, GPT-2) amplify reporting bias, it remains unknown whether such trends continue when models are scaled up. We investigate reporting bias in larger language models (LLMs) such as PaLM and GPT-3. Specifically, we query LLMs for the colour of objects, using colour as a representative property for visual commonsense. Surprisingly, we find that LLMs significantly outperform smaller LMs at answering queries about an object's typical colour. We also find that LLMs' predictions deviate from corpus co-occurrence statistics induced from resources such as Google Books Ngram and are closer to human judgement. We believe this serves as evidence that larger LMs can overcome reporting bias, rather than showing an inverse scaling function as previously suggested.
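To make the comparison in the abstract concrete, the following is a minimal sketch of how a corpus co-occurrence baseline and a model's colour predictions could each be scored against human judgements. It is not the authors' evaluation protocol: the counts, the human labels, the model predictions, and the helper names `corpus_argmax` and `accuracy` are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy (object, colour) co-occurrence counts.
# These are NOT real Google Books Ngram statistics.
cooccurrence = {
    "banana": Counter({"green": 5, "yellow": 3, "brown": 2}),
    "snow": Counter({"white": 6, "yellow": 1}),
}

# Hypothetical human judgements of each object's typical colour.
human_typical = {"banana": "yellow", "snow": "white"}

# Hypothetical answers from a language model queried for each colour.
model_predictions = {"banana": "yellow", "snow": "white"}


def corpus_argmax(counts):
    """Return the colour that co-occurs most often with an object."""
    return counts.most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of objects whose predicted colour matches the gold label."""
    return sum(predictions[obj] == gold[obj] for obj in gold) / len(gold)


# A model that purely mirrored the corpus would make these predictions.
corpus_predictions = {obj: corpus_argmax(c) for obj, c in cooccurrence.items()}

corpus_acc = accuracy(corpus_predictions, human_typical)
model_acc = accuracy(model_predictions, human_typical)
print(f"corpus baseline accuracy: {corpus_acc:.2f}")
print(f"model accuracy: {model_acc:.2f}")
```

In this toy setup the corpus baseline picks "green" for bananas because unusual colours are mentioned more often than the typical one, which is the kind of reporting bias the abstract describes. A model whose answers track the human labels rather than the raw counts would score higher than the baseline.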

Research Areas

Natural Language Processing
