Hurdles to Progress in Long-form Question Answering

Kalpesh Krishna
Mohit Iyyer
NAACL (2021)
Abstract

There has been remarkable recent progress in factoid open-domain question answering (QA), where a short phrase or entity is sufficient to answer the question. Far less work has been done on the more challenging task of long-form QA, where the goal is to generate elaborate, paragraph-long answers to more open-ended questions. In this work, we present a new system based on sparse attention and contrastive retriever learning, which achieves state-of-the-art performance on ELI5, a popular long-form QA dataset in the KILT benchmark (Petroni et al. 2020). However, a detailed analysis of our system reveals several concerning trends that are hampering progress in this important area: (1) little to no evidence that our model's generations are actually grounded in the retrieved documents, a desirable property that is not captured by metrics in the KILT benchmark; (2) significant overlap between the train, validation, and test sets of ELI5, with at least 75% of validation questions having a paraphrased counterpart in the training data; (3) significant issues with the popular evaluation metric ROUGE-L, with a very low margin of improvement (2-5 ROUGE-L) from lower-bound trivial baselines (like input copying) to upper-bound reference baselines; (4) inherent difficulty of human evaluation in this task due to the long length of generated answers and evaluators' unfamiliarity with the question topics.
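To illustrate the kind of lower-bound comparison described in point (3), the sketch below scores a trivial input-copying baseline with ROUGE-L. It is not part of the paper's evaluation pipeline; it is a minimal example assuming the open-source `rouge-score` package, and the question/answer strings are hypothetical placeholders rather than ELI5 data.

```python
# Minimal sketch (illustrative only): scoring an "input copying" baseline
# with ROUGE-L using the rouge-score package.
from rouge_score import rouge_scorer

# Hypothetical example; real ELI5 questions and answers are much longer.
question = "Why is the sky blue?"
reference_answer = (
    "Sunlight is scattered by molecules in the atmosphere, and shorter "
    "blue wavelengths are scattered more strongly, so the sky looks blue."
)
copied_input = question  # the trivial baseline simply echoes the question

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
score = scorer.score(reference_answer, copied_input)
print(f"ROUGE-L F1 of input copying: {score['rougeL'].fmeasure:.3f}")
```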