Multilingual Mix: Example Interpolation Improves Multilingual Neural Machine Translation

Yong Cheng; Ankur Bapna; Orhan Firat; Yuan Cao; Pidong Wang; Wolfgang Macherey

ACL 2022

Abstract

Multilingual neural machine translation (NMT) typically learns to maximize the likelihood of training examples drawn from a combined set of multiple language pairs. However, this mechanical combination relies only on basic sharing to learn the inductive bias, which undermines the generalization and transferability of multilingual NMT models. In this paper, we introduce a multilingual crossover encoder-decoder (mXEnDec) that fuses language pairs at the instance level to exploit cross-lingual signals. For better fusion on multilingual data, we propose several techniques to deal with language interpolation, dissimilar language fusion, and heavy data imbalance. Experimental results on a large-scale WMT multilingual data set show that our approach significantly improves model performance on general multilingual test sets and model transferability on zero-shot test sets (up to $+5.53$ BLEU).
Results on noisy inputs demonstrate the capability of our approach to improve model robustness against code-switching noise. We also conduct qualitative and quantitative representation comparisons to analyze the advantages of our approach at the representation level.
