- Maruan Al-Shedivat
- Ankur Parikh
Generalization and reliability of multilingual translation systems highly depend on the amount of parallel data available for each language pair of interest. In this paper, we focus on zero-shot generalization, a challenging setup that tests systems on translation directions they have never been optimized for at training time. To address this problem, we (i) reformulate multilingual translation as probabilistic inference and show that standard training is ad hoc and often results in models unsuitable for zero-shot tasks, (ii) introduce an agreement-based training method that encourages the model to produce equivalent translations of parallel sentences in an auxiliary third language, and (iii) make a simple change to the decoder that makes the agreement losses end-to-end differentiable. We test our multilingual NMT architectures on multiple public zero-shot translation benchmarks and show that agreement-based learning often results in 2-3 BLEU point improvements over strong baselines without any loss in performance on supervised directions.
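As a toy illustration of the agreement idea described above (not the paper's actual implementation), one can penalize disagreement between the token-level distributions a model produces when translating two parallel source sentences into the same auxiliary language. The function name, the symmetrized cross-entropy form, and the use of plain NumPy arrays are all assumptions made for this sketch:

```python
import numpy as np

def agreement_loss(p_src1_to_aux, p_src2_to_aux, eps=1e-9):
    """Hypothetical agreement term (illustration only, not the paper's loss).

    Each argument holds next-token distributions over the auxiliary-language
    vocabulary, shape (sequence_length, vocab_size), rows summing to 1:
    one from translating the first source sentence, one from its parallel
    counterpart. The loss is a symmetrized cross-entropy averaged over
    positions; it is smallest when the two translation paths agree.
    """
    p = np.asarray(p_src1_to_aux, dtype=float)
    q = np.asarray(p_src2_to_aux, dtype=float)
    # Cross-entropy in both directions, averaged over sequence positions.
    ce_pq = -np.mean(np.sum(p * np.log(q + eps), axis=-1))
    ce_qp = -np.mean(np.sum(q * np.log(p + eps), axis=-1))
    return 0.5 * (ce_pq + ce_qp)
```

In training, a term like this would be added to the usual supervised likelihood losses, so that the zero-shot directions receive a learning signal through the auxiliary language even without direct parallel data.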