ReadTwice: Reading Very Large Documents with Memories

Michiel de Jong
Ilya Eckstein
Joshua Ainslie
Yury Zemlyanskiy
Proceedings of NAACL 2021 (to appear)

Abstract

Knowledge-intensive tasks such as question answering often require assimilating information from different sections of large inputs, such as books or collections of articles. We propose ReadTwice, a simple and effective method that combines the advantages of existing approaches that modify Transformers to model long-range dependencies. The main idea is to read the text in smaller segments and summarize each segment into a memory table, which is then used during a second read of the text. We show that the model outperforms models of comparable size on several question answering (QA) datasets and sets a new state of the art on the challenging NarrativeQA dataset, whose questions are about entire books.
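
To make the two-pass idea concrete, below is a minimal sketch of a "read twice" architecture, not the authors' implementation: the layer sizes, the mean-pooling summarizer, the single memory slot per segment, and the cross-attention fusion (class ReadTwiceSketch) are all illustrative assumptions.

```python
# Minimal sketch of the "read, summarize to memory, read again" idea.
# NOT the paper's architecture; all components here are assumptions.
import torch
import torch.nn as nn


class ReadTwiceSketch(nn.Module):
    def __init__(self, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        # First read: encode each segment independently.
        self.first_read = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers
        )
        # Second read: re-encode segments after fusing in memory information.
        self.second_read = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), num_layers
        )
        # Cross-attention from segment tokens to the memory table.
        self.memory_attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)

    def forward(self, segments):
        # segments: (num_segments, segment_len, d_model) token embeddings.
        # --- First read: summarize each segment into one memory slot. ---
        encoded = self.first_read(segments)
        memory = encoded.mean(dim=1)                  # (num_segments, d_model)
        memory = memory.unsqueeze(0)                  # (1, num_segments, d_model)

        # --- Second read: each segment attends to the full memory table, ---
        # --- giving it a compressed view of the rest of the document.    ---
        mem = memory.expand(segments.size(0), -1, -1)
        fused, _ = self.memory_attn(encoded, mem, mem)
        return self.second_read(encoded + fused)


# Usage: a "document" of 8 segments, 64 token embeddings each.
model = ReadTwiceSketch()
out = model(torch.randn(8, 64, 128))
print(out.shape)  # torch.Size([8, 64, 128])
```

The point of the sketch is the data flow: long-range information travels only through the small memory table, so each pass still attends within short segments and stays cheap.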