Sentence Compression by Deletion with LSTMs

Katja Filippova; Enrique Alfonseca; Carlos Colmenares; Lukasz Kaiser; Oriol Vinyals

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP'15)

Abstract

We present an LSTM approach to deletion-based sentence compression where the task is to translate a sentence into a sequence of zeros and ones, corresponding to token deletion decisions. We demonstrate that even the most basic version of the system, which is given no syntactic information (no PoS or NE tags, or dependencies) or desired compression length, performs surprisingly well: around 30% of the compressions from a large test set could be regenerated. We compare the LSTM system with a competitive baseline which is trained on the same amount of data but is additionally provided with all kinds of linguistic features. In an experiment with human raters the LSTM-based model outperforms the baseline, achieving 4.5 in readability and 3.8 in informativeness.
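
To make the task formulation concrete, the following is a minimal sketch of how a sequence of zeros and ones maps back to a compressed sentence, and of how the share of exactly regenerated compressions could be computed. It is not the authors' implementation. The function names `apply_deletion_labels` and `exact_match_rate`, the example sentence, and the labels are all hypothetical. The label convention (1 = keep the token, 0 = delete it) is an assumption, since the abstract does not state which value marks deletion.

```python
from typing import List, Sequence


def apply_deletion_labels(tokens: Sequence[str], labels: Sequence[int]) -> List[str]:
    """Return the tokens that survive compression.

    Assumed convention (not stated in the abstract): 1 = keep, 0 = delete.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [tok for tok, lab in zip(tokens, labels) if lab == 1]


def exact_match_rate(
    predicted: Sequence[Sequence[int]], reference: Sequence[Sequence[int]]
) -> float:
    """Fraction of sentences whose predicted label sequence equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        return 0.0
    matches = sum(1 for p, r in zip(predicted, reference) if list(p) == list(r))
    return matches / len(predicted)


# Hypothetical example sentence and labels, for illustration only.
tokens = "The mayor , who was elected last year , resigned on Monday".split()
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
compressed = apply_deletion_labels(tokens, labels)
print(" ".join(compressed))
```

Under this reading, a compression counts as regenerated only when every per-token decision matches the reference, which is a strict criterion compared with per-token accuracy.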
