Jonathan Mallinson
Research Areas
Authored Publications
Sort By
Text Generation with Text-Editing Models
Daniil Mirylenka
Jakub Adamek
Yue Dong
Proceedings of NAACL 2022, ACL
Preview abstract
Text-editing models have recently become a prominent alternative to seq2seq models for monolingual natural language generation (NLG) tasks such as grammatical error correction, text simplification, and style transfer. These tasks exhibit a large amount of textual overlap between the source and target texts. Text-editing models take advantage of this trait and learn to generate the output by predicting edit operations applied to the source sequence in contrast to seq2seq models that generate the output from scratch. Text-editing models provide several benefits over seq2seq models including faster inference speed, higher sample efficiency, and better control and interpretability of the outputs. This tutorial provides a comprehensive overview of the text-edit based approaches and current state-of-the-art models, analyzing the pros and cons of different methods. We discuss challenges related to productionization and how these models can to help mitigate hallucination and bias, both pressing challenges in the field of text generation.
View details
Preview abstract
ASR Error Detection (AED) models aim to post-process the output of Automatic Speech Recognition (ASR) systems, in order to detect transcription errors. Modern approaches usually use text-based input, comprised of the ASR transcription hypothesis, without leveraging additional signals from the ASR model, resulting in loss of important acoustic information. In this work, we propose to utilize the ASR system's word-level confidence scores for improving AED performance. Specifically, we propose to add an ASR Confidence Embedding (ACE) layer to the AED model's encoder, allowing us to jointly encode the confidence scores and the transcribed text into a contextualized representation. Our experiments show the benefits of ASR confidence scores for AED, their complementary effect over the textual signal, as well as the effectiveness and robustness of our approach for combining those signals. To foster further research, we curate and publish a novel AED dataset consisting of ASR outputs on the LibriSpeech corpus with annotated transcription errors.
View details
Preview abstract
We propose a new model for grammatical error correction (GEC) which builds on a very large multilingual masked language model, covering 101 languages. To adapt our model for the GEC task, we design an unsupervised, language-agnostic pretraining objective that mimics corrections typically contained in labeled data. After finetuning on gold data, we surpass the previous state-of-the-art results on the four evaluated languages (Czech, English, German and Russian). This approach shows the power of large multilingual language models. Due to these models being non-trivial to run on non-cluster infrastructure, we employ our model to clean up the labels in the popular yet noisy Lang-8 dataset. We release this dataset and hope that the community will find it useful for further advancement of GEC.
View details
Preview abstract
We present FELIX --- a flexible text-editing approach for generation, designed to derive maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training. In contrast to conventional sequence-to-sequence (seq2seq) models, FELIX is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input. The tagging model employs a novel Pointer mechanism, while the insertion model is based on a Masked Language Model. Both of these models are chosen to be non-autoregressive to guarantee faster inference. FELIX performs favourably when compared to recent text-editing methods and strong seq2seq baselines when evaluated on four NLG tasks: Sentence Fusion, Machine Translation Automatic Post-Editing, Summarization, and Text Simplification.
View details