MoQA: Benchmarking Multi-Type Open-Domain Question Answering

Howard Yen
Tianyu Gao
Danqi Chen
Proceedings of the Third DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, Association for Computational Linguistics (2023), 8–29


Existing open-domain question answering research mainly focuses on questions that can be answered in a few words. However, information-seeking questions often require different answer formats depending on the nature of the question, e.g., "Why is there a maple leaf on the Canadian flag?" In this paper, we present a new task, MoQA, which requires building QA models that can provide short, medium, long, and yes/no answers to open-domain questions simultaneously. We expand the Natural Questions dataset into the open-domain setting by keeping all types of questions, and we show that existing systems cannot generalize to these new types. We adapt state-of-the-art open-domain QA models (based on retriever-reader and phrase retrieval models) to tackle this task. Results and analyses of our multi-type QA models reveal the unique challenges of the task, calling for versatile QA models in the future.