Better Alignments = Better Translations?
Abstract
Automatic word alignment is a key step in
training statistical machine translation systems. Despite much recent work on word
alignment methods, gains in alignment accuracy often yield little or no improvement in machine translation quality. In
this work we analyze a recently proposed
agreement-constrained EM algorithm for unsupervised alignment models. We attempt to
tease apart the effects of this simple but effective modification on the trade-off between alignment precision and recall, and on how rare and
common words are affected across several language pairs. We propose and extensively evaluate a simple method for using alignment
models to produce alignments better suited
for phrase-based MT systems, and show significant gains (as measured by BLEU score)
in end-to-end translation systems for six language pairs used in recent MT competitions.