Google Research

Evaluating the Cross-Lingual Effectiveness of Massively Multilingual Neural Machine Translation

Abstract

The recently proposed massively multilingual neural machine translation (NMT) system has been shown to be capable of translating 102 languages to and from English within a single model. In this paper, we evaluate the cross-lingual effectiveness of representations from the encoder of such a model on 5 downstream classification and sequence tagging tasks spanning more than 50 languages. We compare our results against a strong multilingual baseline, multilingual BERT, and show modest gains in zero-shot cross-lingual transfer on 4 out of these 5 tasks. Our results provide insight into how well the representations learned by multilingual machine translation transfer across languages and tasks.
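The transfer setup the abstract describes (reusing a frozen multilingual encoder's representations, with a task-specific head trained on English data and applied directly to other languages) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `MultilingualEncoder` class, its dimensions, the mean-pooling step, and the 3-way task head are placeholders, since the actual massively multilingual NMT encoder is not reproduced here.

```python
# Minimal sketch of zero-shot cross-lingual transfer with a frozen
# multilingual encoder. `MultilingualEncoder` is a stand-in for the
# massively multilingual NMT encoder described in the paper; all
# dimensions and the task head are illustrative assumptions.
import torch
import torch.nn as nn

class MultilingualEncoder(nn.Module):
    """Placeholder encoder: maps token ids to contextual vectors."""
    def __init__(self, vocab_size=32000, d_model=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.layer = nn.TransformerEncoderLayer(d_model, nhead=8,
                                                batch_first=True)

    def forward(self, token_ids):
        # (batch, seq) -> (batch, seq, d_model)
        return self.layer(self.embed(token_ids))

encoder = MultilingualEncoder().eval()  # frozen: no gradient updates
classifier = nn.Linear(512, 3)          # hypothetical 3-way task head

def predict(token_ids):
    with torch.no_grad():                        # encoder stays fixed
        pooled = encoder(token_ids).mean(dim=1)  # mean-pool over tokens
    return classifier(pooled)

# Train `classifier` on English labeled data only, then call `predict`
# on tokenized inputs in other languages: the shared multilingual
# encoder space is what enables the zero-shot transfer being evaluated.
english_batch = torch.randint(0, 32000, (4, 16))
print(predict(english_batch).shape)  # torch.Size([4, 3])
```

For sequence tagging tasks, the same sketch would apply the head per token (skipping the pooling step) rather than to a pooled sentence vector.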
