Decoupled Context Processing for Context Augmented Language Modeling
Abstract
Language models can be augmented with a context retriever to incorporate knowledge from large external databases. By leveraging the retrieved context, the neural network does not have to memorize the massive amount of world knowledge within its internal parameters, leading to better parameter efficiency, interpretability, and modularity. In this paper we examine a simple yet effective architecture for incorporating external context into language models, based on a decoupled encoder-decoder architecture. We show that such a simple architecture achieves competitive results on autoregressive language modeling and open-domain question answering tasks. We also analyze the behavior of the proposed model, which performs grounded context transfer. Finally, we discuss the computational implications of such retrieval-augmented models.
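To make the decoupled design concrete, the following is a minimal sketch in PyTorch, not the paper's implementation; all module names, dimensions, and hyperparameters here are hypothetical. The key property it illustrates is that retrieved context is encoded by a standalone encoder whose outputs do not depend on the decoder input and can therefore be precomputed and cached, while the decoder consumes the cached encodings through cross-attention.

```python
# Minimal sketch of decoupled context processing (hypothetical names/sizes).
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Encodes retrieved context independently of the decoder input."""
    def __init__(self, vocab_size: int, d_model: int = 256, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, context_ids: torch.Tensor) -> torch.Tensor:
        # These encodings depend only on the retrieved passages, so they
        # can be computed once offline and cached for reuse across queries.
        return self.encoder(self.embed(context_ids))

class ContextAugmentedDecoder(nn.Module):
    """Autoregressive decoder that cross-attends to cached context encodings."""
    def __init__(self, vocab_size: int, d_model: int = 256, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids: torch.Tensor,
                context_memory: torch.Tensor) -> torch.Tensor:
        x = self.embed(input_ids)
        # Causal mask keeps language modeling autoregressive; cross-attention
        # to context_memory injects the external knowledge.
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.decoder(x, memory=context_memory, tgt_mask=mask)
        return self.lm_head(h)

# Usage: encode retrieved passages once, then decode against the cache.
vocab = 1000
encoder, decoder = ContextEncoder(vocab), ContextAugmentedDecoder(vocab)
context_ids = torch.randint(0, vocab, (1, 64))  # retrieved passage tokens
input_ids = torch.randint(0, vocab, (1, 16))    # language-model input tokens
with torch.no_grad():
    memory = encoder(context_ids)   # cacheable, decoder-independent
    logits = decoder(input_ids, memory)  # shape (1, 16, vocab)
```

Because the encoder side is decoupled from the decoder, context encodings can be shared across queries and amortized, which is the computational benefit the abstract alludes to.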