Source-summary Entity Aggregation in Abstractive Summarization

Jose Angel
Jackie C. K. Cheung
COLING 2022 (to appear)

Abstract

In a discourse, specific entities that have been mentioned can later be referred to by a more general description. For example, 'Celine Dion' and 'Justin Bieber' can be referred to as 'Canadian singers' or 'celebrities'. In this work, we study this phenomenon in the context of summarization, where entities drawn from a source text are generalized in the summary. We call such instances 'source-summary entity aggregations'. We categorize and study several types of source-summary entity aggregations in the CNN/DailyMail corpus, showing that they are reasonably frequent. We experimentally analyze the capabilities of three state-of-the-art summarization systems for generating such aggregations within summaries. We also explore how these systems can be encouraged to generate more aggregations. Our results show that there is significant room for improvement in generating semantically correct and appropriate aggregations.