Google Research

Consistency by Agreement in Zero-shot Neural Machine Translation

Proceedings of NAACL (2019)

Abstract

Generalization and reliability of multilingual translation systems often depend heavily on the amount of parallel data available for each language pair of interest. In this paper, we focus on zero-shot generalization, a challenging setup that tests systems on translation directions they have never been optimized for at training time. To address this problem, we (i) reformulate multilingual translation as probabilistic inference and show that standard training is ad hoc and often yields models unsuitable for zero-shot tasks, (ii) introduce an agreement-based training method that encourages the model to produce equivalent translations of parallel sentences in an auxiliary third language, and (iii) make a simple change to the decoder that renders the agreement losses end-to-end differentiable. We evaluate our multilingual NMT architectures on multiple public zero-shot translation benchmarks and show that agreement-based learning often yields a 2-3 BLEU point improvement over strong baselines without any loss in performance on the supervised directions.
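The agreement idea in (ii) and (iii) can be pictured with a short sketch. The Python (PyTorch) code below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the two supervised directions into the auxiliary language (e.g., English→German and French→German) emit time-aligned soft distributions over a shared vocabulary for a parallel (en, fr) pair, and it symmetrizes a cross-entropy between those distributions so the term stays end-to-end differentiable. All names, shapes, and the alignment assumption are hypothetical.

```python
# Hypothetical agreement-style auxiliary loss for multilingual NMT.
# A sketch only; not the authors' code. Shapes and names are illustrative.
import torch
import torch.nn.functional as F

def agreement_loss(logits_en2de, logits_fr2de, pad_mask):
    """Symmetrized cross-entropy between two soft translations into an
    auxiliary third language, computed from the decoder logits of the
    en->de and fr->de directions on one parallel (en, fr) sentence pair.

    logits_*: (batch, time, vocab) decoder logits over a shared vocabulary,
              assumed time-aligned for this sketch.
    pad_mask: (batch, time), 1 for real target positions, 0 for padding.
    """
    p = F.softmax(logits_en2de, dim=-1)
    q = F.softmax(logits_fr2de, dim=-1)
    # Use soft (expected) targets rather than hard decoded tokens, so the
    # loss remains differentiable end-to-end. Each distribution is treated
    # as a fixed target for the other via detach().
    ce_pq = -(p.detach() * torch.log(q + 1e-9)).sum(-1)
    ce_qp = -(q.detach() * torch.log(p + 1e-9)).sum(-1)
    per_pos = 0.5 * (ce_pq + ce_qp)
    return (per_pos * pad_mask).sum() / pad_mask.sum()

if __name__ == "__main__":
    # Toy sizes, purely for demonstration.
    B, T, V = 2, 5, 100
    logits_a = torch.randn(B, T, V, requires_grad=True)
    logits_b = torch.randn(B, T, V, requires_grad=True)
    mask = torch.ones(B, T)
    loss = agreement_loss(logits_a, logits_b, mask)
    loss.backward()  # gradients flow into both translation directions
    print(float(loss))
```

The key design point the sketch tries to convey is that replacing hard decoded tokens with soft expected distributions is what lets gradients from the agreement term flow back through both supervised directions.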
