Improvements in Dynamic Programming Beam Search for Phrase-based Statistical Machine Translation

Hermann Ney
Proceedings of the International Workshop on Spoken Language Translation, Honolulu, HI (2008), pp. 195-205

Abstract

Search is a central component of any statistical machine translation (SMT) system. We describe the search for phrase-based SMT in detail and show its importance for achieving good translation quality. We introduce an explicit distinction between reordering and lexical hypotheses and organize the pruning accordingly. We show that for the large Chinese-English NIST task, a small number of lexical alternatives is already sufficient, whereas a large number of reordering hypotheses is required to achieve good translation quality. The resulting system compares favorably with the current state of the art; in particular, we compare it with cube pruning and with Moses.
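
The pruning organization mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that a reordering hypothesis can be identified with the set of source positions already covered, that all hypotheses being compared cover the same number of source words, and that higher scores are better. The names Hypothesis, prune, max_reorderings and max_lexical are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Hypothesis(NamedTuple):
    coverage: frozenset  # source positions already translated (assumed stand-in for a reordering hypothesis)
    target: str          # partial translation produced so far
    score: float         # model score, higher is better (assumption)


def prune(hyps, max_reorderings, max_lexical):
    """Two-level histogram pruning: limit coverage sets, then limit
    the lexical alternatives kept inside each surviving coverage set."""
    # Group hypotheses that share the same coverage set.
    groups = defaultdict(list)
    for h in hyps:
        groups[h.coverage].append(h)

    # Sort the lexical alternatives of every group, best first.
    for members in groups.values():
        members.sort(key=lambda h: h.score, reverse=True)

    # Rank coverage sets by their best member and keep the top ones.
    ranked = sorted(groups.values(), key=lambda m: m[0].score, reverse=True)

    kept = []
    for members in ranked[:max_reorderings]:
        # Keep only the best lexical alternatives per coverage set.
        kept.extend(members[:max_lexical])
    return kept


# Toy data, invented for illustration only.
hyps = [
    Hypothesis(frozenset({0, 1}), "a b", -1.0),
    Hypothesis(frozenset({0, 1}), "a c", -1.5),
    Hypothesis(frozenset({0, 1}), "a d", -3.0),
    Hypothesis(frozenset({0, 2}), "a e", -2.0),
    Hypothesis(frozenset({1, 2}), "f g", -4.0),
]
kept = prune(hyps, max_reorderings=2, max_lexical=2)
print([h.target for h in kept])  # ['a b', 'a c', 'a e']
```

Under these assumptions, the finding reported in the abstract would correspond to choosing a small max_lexical together with a large max_reorderings.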
