MiQA: A Benchmark for Inference on Metaphorical Questions

Iulia Comșa

Julian Martin Eisenschlos

Srini Narayanan

AACL-IJCNLP (2022) (to appear)


Abstract

We propose a benchmark to assess the capability of large language models to reason with metaphor. Our benchmark combines the previously isolated topics of metaphor detection and commonsense reasoning into a single task that requires a model to make inferences by accurately selecting between the literal and metaphorical register. We examine the performance of state-of-the-art pretrained models on forced-choice tasks and find a large discrepancy between small and very large models, going from chance- to human-level performance. However, upon examining the generative performance of the largest model, we find that there is still a gap to bridge before human performance is reached in a more natural conversational setting.
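The forced-choice evaluation mentioned above can be illustrated with a minimal sketch. It assumes each item pairs a premise with candidate conclusions, exactly one of which is correct, and that a model assigns a numeric score to each candidate. The item layout, the function names, and the `score` callback are hypothetical and chosen only for illustration; they are not the MiQA data format or the authors' evaluation code.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical item layout (an assumption, not the MiQA schema):
# (premise, candidate conclusions, index of the correct candidate)
Item = Tuple[str, Sequence[str], int]


def forced_choice_accuracy(
    items: Sequence[Item],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of items where the top-scoring candidate is the correct one.

    `score(premise, candidate)` is a caller-supplied stand-in for whatever
    the model provides, e.g. a log-likelihood of the candidate given the premise.
    """
    if not items:
        raise ValueError("need at least one item")
    correct = 0
    for premise, candidates, gold in items:
        scores = [score(premise, candidate) for candidate in candidates]
        predicted = scores.index(max(scores))  # first index wins on ties
        correct += int(predicted == gold)
    return correct / len(items)


def chance_accuracy(items: Sequence[Item]) -> float:
    """Expected accuracy of uniform random guessing over the same items."""
    if not items:
        raise ValueError("need at least one item")
    return sum(1.0 / len(candidates) for _, candidates, _ in items) / len(items)


# Placeholder items and a toy scorer, both invented for this sketch.
demo_items = [
    ("premise 1", ["a", "bb"], 0),
    ("premise 2", ["ccc", "d"], 1),
    ("premise 3", ["e", "ff"], 1),
]


def toy_score(premise: str, candidate: str) -> float:
    # Toy rule: prefer the shorter candidate. A real scorer would query a model.
    return -float(len(candidate))


demo_accuracy = forced_choice_accuracy(demo_items, toy_score)
demo_chance = chance_accuracy(demo_items)
```

Comparing `demo_accuracy` against `demo_chance` mirrors the chance-level versus human-level framing of the abstract: with two candidates per item, a scorer that cannot tell the literal from the metaphorical reading should land near 0.5.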

Research Areas

Natural Language Processing
Machine Intelligence
