Sentence Retrieval for Open-Ended Dialogue using Dual Contextual Modeling

Itay Harel
Hagai Taitelbaum
Oren Kurland
CIKM 2021 (2023)

Abstract

We address the task of sentence retrieval for open-ended dialogues. The goal is to retrieve sentences from a document corpus that contain information useful for generating the next turn in a given dialogue. To this end, we propose several novel architectures for dual contextual modeling: the dialogue context and the context of the sentence in its ambient document. The architectures utilize fine-tuned contextualized language models (BERT). We are not aware of previous work that modeled the context of the sentence (passage) to be retrieved in a dialogue setting. Furthermore, some of the techniques we present for modeling the dialogue context are novel to this study. To evaluate the models, we constructed a test set that includes open-ended dialogues from Reddit, candidate sentences from Wikipedia for each dialogue, and human annotations for the sentences. To train the neural-based models, we devised a weak supervision method applied to a large-scale Reddit dataset. We empirically compared our models with a wide array of strong reference comparisons. The performance of our most effective model is substantially superior to that of all baselines, demonstrating the merits of our novel architectures and weakly-supervised training approach.
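
To make the notion of dual context concrete, the following is a minimal illustrative sketch of how the two contexts might be assembled into a text pair for a BERT-style cross-encoder. It is not the architecture proposed in this paper. The function name `build_dual_context_input`, the `window` parameter, the plain-text `[SEP]` separator, and the choice of neighboring sentences as the ambient-document context are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumption: a BERT-style separator written as plain text; a real tokenizer
# would insert its own special tokens.
SEP = " [SEP] "


def build_dual_context_input(
    dialogue_turns: List[str],
    doc_sentences: List[str],
    target_idx: int,
    window: int = 1,
) -> Tuple[str, str]:
    """Return (dialogue_segment, sentence_segment) as a hypothetical text pair."""
    # Dialogue context: all turns so far, oldest first.
    dialogue_segment = SEP.join(dialogue_turns)

    # Sentence context: up to `window` neighbors on each side of the
    # candidate sentence in its ambient document.
    start = max(0, target_idx - window)
    end = min(len(doc_sentences), target_idx + window + 1)
    neighbors = [
        s for i, s in enumerate(doc_sentences[start:end], start) if i != target_idx
    ]

    # The candidate sentence comes first, followed by its neighbors.
    sentence_segment = SEP.join([doc_sentences[target_idx]] + neighbors)
    return dialogue_segment, sentence_segment


# Hypothetical example data, not taken from the paper's dataset.
turns = ["Any tips for a first marathon?", "Start slow and build mileage."]
doc = [
    "A marathon is a long-distance race.",
    "It covers 42.195 kilometres.",
    "Training usually takes months.",
]
dialogue_segment, sentence_segment = build_dual_context_input(turns, doc, target_idx=1)
```

In such a setup, the resulting pair would be passed to a fine-tuned model that scores how useful the candidate sentence is for generating the next turn; how the two contexts are actually combined is precisely what the proposed architectures vary.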