Findings of the 2021 Conference on Machine Translation (WMT21)

Farhad Akhbardeh
Arkady Arkhangorodsky
Magdalena Biesialska
Ondřej Bojar
Rajen Chatterjee
Vishrav Chaudhary
Marta R. Costa-jussà
Cristina España-Bonet
Angela Fan
Christian Federmann
Yvette Graham
Roman Grundkiewicz
Barry Haddow
Leonie Harter
Kenneth Heafield
Christopher M. Homan
Matthias Huck
Kwabena Amponsah-Kaakyire
Jungo Kasai
Daniel Khashabi
Kevin Knight
Tom Kocmi
Philipp Koehn
Nicholas Lourie
Christof Monz
Makoto Morishita
Masaaki Nagata
Ajay Nagesh
Toshiaki Nakazawa
Matteo Negri
Santanu Pal
Allahsera Tapo
Marco Turchi
Valentin Vydrin
Marcos Zampieri
Proceedings of the Sixth Conference on Machine Translation, Association for Computational Linguistics, Online (2021), pp. 1-88

Abstract

This paper presents the results of the news translation task, the multilingual low-resource translation task for Indo-European languages, the triangular translation task, and the automatic post-editing task organised as part of the Conference on Machine Translation (WMT) 2021. In the news task, participants were asked to build machine translation systems for any of 10 language pairs, to be evaluated on test sets consisting mainly of news stories. The task was also opened up to additional test suites to probe specific aspects of translation. In the Similar Language Translation (SLT) task, participants were asked to develop systems to translate between pairs of similar languages from the Dravidian and Romance families, as well as from French to two similar low-resource Manding languages (Bambara and Maninka). In the Triangular MT task, participants were asked to build a Russian-to-Chinese translator, given parallel data in Russian-Chinese, Russian-English and English-Chinese. In the multilingual low-resource translation task for Indo-European languages, participants built multilingual systems to translate among Romance and North-Germanic languages. The task was designed to deal with the translation of documents in the cultural heritage domain for relatively low-resourced languages. In the automatic post-editing (APE) task, participants were asked to develop systems capable of correcting the errors made by an unknown machine translation system.