Efficient Phrase-table Representation for Machine Translation with Applications to Online MT and Speech Translation

Richard Zens

Hermann Ney

Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), ACL, Rochester, NY (2007), pp. 492-499

Abstract

In phrase-based statistical machine translation, the phrase-table requires a large amount of memory. We will present an efficient representation with two key properties: on-demand loading and a prefix tree structure for the source phrases. We will show that this representation scales well to large data tasks and that we are able to store hundreds of millions of phrase pairs in the phrase-table. For the large Chinese–English NIST task, the memory requirements of the phrase-table are reduced to less than 20 MB using the new representation with no loss in translation quality and speed. Additionally, the new representation is not limited to a specific test set, which is important for online or real-time machine translation. One problem in speech translation is the matching of phrases in the input word graph and the phrase-table. We will describe a novel algorithm that effectively solves this combinatorial problem exploiting the prefix tree data structure of the phrase-table. This algorithm enables the use of significantly larger input word graphs in a more efficient way resulting in improved translation quality.
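
To illustrate the prefix tree idea mentioned in the abstract, here is a minimal Python sketch of a phrase-table keyed by source-phrase prefixes. It is not the authors' implementation: the class names (`PrefixTreeNode`, `PhraseTable`), the method names, the scores, and the example words are all assumptions made up for illustration. The sketch keeps everything in memory and does not model on-demand loading or the word graph matching algorithm.

```python
from typing import Dict, List, Tuple


class PrefixTreeNode:
    """One node per source-phrase prefix; children are keyed by the next source word."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixTreeNode"] = {}
        # Target-side candidates for the source phrase that ends at this node.
        self.translations: List[Tuple[str, float]] = []


class PhraseTable:
    """Hypothetical in-memory phrase-table stored as a prefix tree over source words."""

    def __init__(self) -> None:
        self.root = PrefixTreeNode()

    def add(self, source: List[str], target: str, score: float) -> None:
        """Insert one phrase pair; shared source prefixes share tree nodes."""
        node = self.root
        for word in source:
            node = node.children.setdefault(word, PrefixTreeNode())
        node.translations.append((target, score))

    def matches_from(
        self, sentence: List[str], start: int
    ) -> List[Tuple[int, List[Tuple[str, float]]]]:
        """Return (end, translations) for every stored source phrase starting at `start`."""
        found = []
        node = self.root
        for end in range(start, len(sentence)):
            node = node.children.get(sentence[end])
            if node is None:
                # No stored phrase has this prefix, so longer spans cannot match.
                break
            if node.translations:
                found.append((end + 1, node.translations))
        return found


# Illustrative usage with made-up phrase pairs and scores.
table = PhraseTable()
table.add(["das"], "the", 0.9)
table.add(["das", "haus"], "the house", 0.7)
matches = table.matches_from(["das", "haus", "ist"], 0)
```

In this sketch, a single walk down the tree finds every stored source phrase that starts at a given position, and the walk stops as soon as a prefix is missing. That early stop is the property a prefix tree offers for phrase matching in general; how the paper exploits it for word graphs is described in the paper itself, not here.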
