Improvements in phrase-based statistical machine translation
Abstract
In statistical machine translation, the currently
best performing systems are based in some way
on phrases or word groups. We describe the
baseline phrase-based translation system and
various refinements, as well as a highly efficient
monotone search algorithm with a complexity
linear in the input sentence length. We present
translation results for three tasks: Verbmobil,
Xerox and the Canadian Hansards. For
the Xerox task, it takes less than 7 seconds to
translate the whole test set consisting of more
than 10K words. The translation results for
the Xerox and Canadian Hansards tasks are very
promising. The system even outperforms the
alignment template system.
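
To make the linear-complexity claim concrete, the following is a minimal sketch of a monotone phrase-based search written as a dynamic program over source positions. It is an illustration under stated assumptions, not the algorithm of the paper: the function name `monotone_search`, the toy phrase-table format, the `max_phrase_len` bound, and the omission of a language model are all assumptions made here. With the phrase length and the number of translation options per phrase bounded by constants, the work done per source position is constant, so the running time grows linearly with the sentence length.

```python
import math


def monotone_search(source, phrase_table, max_phrase_len=3):
    """Find the best monotone phrase segmentation and translation.

    source: list of source words.
    phrase_table: dict mapping a tuple of source words to a list of
        (target_phrase, log_prob) pairs (hypothetical toy format).
    Returns (best_log_score, target_phrases); (-inf, []) if the
    sentence cannot be fully covered.
    """
    n = len(source)
    best = [-math.inf] * (n + 1)  # best[j]: best score covering source[:j]
    back = [None] * (n + 1)       # back[j]: (start, target) of last phrase
    best[0] = 0.0

    for j in range(1, n + 1):
        # Only the last max_phrase_len start positions are considered,
        # which keeps the work per position independent of n.
        for start in range(max(0, j - max_phrase_len), j):
            if best[start] == -math.inf:
                continue
            options = phrase_table.get(tuple(source[start:j]), [])
            for target, log_prob in options:
                score = best[start] + log_prob
                if score > best[j]:
                    best[j] = score
                    back[j] = (start, target)

    if best[n] == -math.inf:
        return -math.inf, []

    # Follow the backpointers from the end of the sentence.
    phrases = []
    j = n
    while j > 0:
        start, target = back[j]
        phrases.append(target)
        j = start
    phrases.reverse()
    return best[n], phrases


# Toy phrase table with made-up probabilities, for illustration only.
toy_table = {
    ("das",): [("the", math.log(0.6)), ("that", math.log(0.4))],
    ("haus",): [("house", math.log(0.9))],
    ("das", "haus"): [("the house", math.log(0.7))],
    ("ist",): [("is", math.log(0.9))],
    ("klein",): [("small", math.log(0.8))],
}

score, phrases = monotone_search(["das", "haus", "ist", "klein"], toy_table)
print(phrases)  # ['the house', 'is', 'small']
```

Because the search is monotone, a hypothesis is fully described by how much of the source prefix it covers, so a single table indexed by source position suffices. Adding a language model would require extending the state with target-side history, which this sketch deliberately leaves out.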