Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

Andrew McCallum
Rajarshi Das
Shehzaad Dhuliawala
ICLR (2019)

Abstract

This paper introduces a new framework for open-domain question answering in which the retriever and the reader iteratively interact with each other. The framework is agnostic to the architecture of the machine reading model, requiring only access to the token-level hidden representations of the reader. The retriever uses fast nearest-neighbor search algorithms that allow it to scale to corpora containing millions of paragraphs. A gated recurrent unit updates the query at each step conditioned on the “state” of the reader, and the retriever uses the “reformulated” query to re-rank the paragraphs. We show the efficacy of our architecture by achieving state-of-the-art results (9.5% relative increase) on TriviaQA-unfiltered, and we achieve competitive performance on other large open-domain datasets such as Quasar-T, SearchQA, and SQuAD-Open. Our analysis shows that iterative interaction helps in retrieving useful paragraphs from the corpus. Finally, we show that our multi-step-reasoning framework brings uniform improvements when applied to two widely used reader architectures: DrQA and BiDAF.
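The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. Several parts are assumptions made only for the example: brute-force inner-product search stands in for the fast nearest-neighbor search, attention pooling over token-level hidden states stands in for the reader "state", the GRU weights are random rather than trained, and all function names (`retrieve`, `reader_state`, `gru_update`, `multi_step_retrieve`, `init_params`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(d, rng):
    # Six d x d matrices for a single GRU cell (random here; learned in practice).
    return tuple(rng.normal(scale=0.1, size=(d, d)) for _ in range(6))


def retrieve(query, paragraph_vecs, k):
    # Assumption: exhaustive inner-product scoring stands in for the
    # fast nearest-neighbor search mentioned in the abstract.
    scores = paragraph_vecs @ query
    return np.argsort(-scores, kind="stable")[:k]


def reader_state(query, token_states):
    # Assumption: the reader "state" is an attention-weighted sum of the
    # token-level hidden representations of the retrieved paragraphs.
    flat = token_states.reshape(-1, token_states.shape[-1])
    logits = flat @ query
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ flat


def gru_update(query, state, params):
    # Standard GRU cell: the query is the hidden state, the reader state is the input.
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(Wz @ state + Uz @ query)
    r = sigmoid(Wr @ state + Ur @ query)
    candidate = np.tanh(Wh @ state + Uh @ (r * query))
    return (1.0 - z) * query + z * candidate


def multi_step_retrieve(query, paragraph_vecs, paragraph_tokens, params, steps, k):
    # Alternate between retrieving paragraphs and reformulating the query.
    history = []
    for _ in range(steps):
        top = retrieve(query, paragraph_vecs, k)
        history.append(top)
        state = reader_state(query, paragraph_tokens[top])
        query = gru_update(query, state, params)
    return history, query


# Toy usage with random data: 50 paragraphs, 5 tokens each, 8-dim vectors.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(50, 5, 8))
paragraph_vecs = tokens.mean(axis=1)
history, final_query = multi_step_retrieve(
    rng.normal(size=8), paragraph_vecs, tokens, init_params(8, rng), steps=3, k=4
)
for step, top in enumerate(history):
    print(f"step {step}: retrieved paragraphs {top.tolist()}")
```

Because paragraph vectors are computed once and only the query changes between steps, each re-ranking reduces to a new nearest-neighbor lookup, which is what would let such a loop scale to large corpora when an approximate index replaces the brute-force scoring used here.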
