Improved Word Alignment Using a Symmetric Lexicon Model

Richard Zens

Evgeny Matusov

Hermann Ney

Proceedings of the 20th International Conference on Computational Linguistics (Coling), Geneva, Switzerland (2004), pp. 36-42

Abstract

Word-aligned bilingual corpora are an important knowledge source for many tasks in natural language processing. We improve the well-known IBM alignment models, as well as the Hidden-Markov alignment model, using a symmetric lexicon model. This symmetrization takes not only the standard translation direction from source to target into account, but also the inverse translation direction from target to source. We present a theoretically sound derivation of these techniques. In addition to the symmetrization, we introduce a smoothed lexicon model. The standard lexicon model is based on full-form words only. We propose a lexicon smoothing method that takes the word base forms explicitly into account. Therefore, it is especially useful for highly inflected languages such as German. We evaluate these methods on the German–English Verbmobil task and the French–English Canadian Hansards task. We show statistically significant improvements of the alignment quality compared to the best system reported so far. For the Canadian Hansards task, we achieve an improvement of more than 30% relative.
