Richard Zens

Authored Publications
    Content Explorer: Recommending Novel Entities for a Document Writer
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018.
    Background research is an inseparable part of document writing. Search engines are great for retrieving information once we know what to look for; the bigger challenge is often identifying topics for further research. Automated tools could help significantly in this discovery process and increase the writer's productivity. In this paper, we formulate the problem of recommending topics to a writer as a supervised learning problem and run a user study to validate this approach. We propose an evaluation metric and perform an empirical comparison of state-of-the-art models for extreme multi-label classification on a large data set. We demonstrate how a simple modification of the cross-entropy loss function leads to improved results for the deep learning models.
    Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
    Joern Wuebker
    Hermann Ney
    Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Jeju, Republic of Korea (2012), pp. 28-32
    In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions. Our results show that language model based pre-sorting yields a small improvement in translation quality and a speedup by a factor of 2. Two look-ahead methods are shown to further increase translation speed by a factor of 2 without changing the search space and a factor of 4 with the side effect of some additional search errors. We compare our approach with Moses and observe the same performance, but a substantially better trade-off between translation quality and speed. At a speed of roughly 70 words per second, Moses reaches 17.2% BLEU, whereas our approach yields 20.0% with identical models.
    A Systematic Comparison of Phrase Table Pruning Techniques
    Peng Xu
    Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, Jeju Island, Korea, pp. 972-983
    When trained on very large parallel corpora, the phrase table component of a machine translation system grows to consume vast computational resources. In this paper, we introduce a novel pruning criterion that places phrase table pruning on a sound theoretical foundation. Systematic experiments on four language pairs under various data conditions show that our principled approach is superior to existing ad hoc pruning methods.
    Improvements for Beam Search in Statistical Machine Translation
    Oliver Bender
    Hermann Ney
    Handbook of Natural Language Processing and Machine Translation, Springer (2011)
    Name extraction and translation for distillation
    Heng Ji
    Ralph Grishman
    Dayne Freitag
    Matthias Blume
    John Wang
    Shahram Khadivi
    Hermann Ney
    Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation (2009)
    Name translation is important well beyond the relative frequency of names in a text: a correctly translated passage, but with the wrong name, may lose most of its value. The Nightingale team has built a name translation component which operates in tandem with a conventional phrase-based statistical MT system, identifying names in the source text and proposing translations to the MT system. Versions have been developed for both Chinese-to-English and Arabic-to-English name translation. The system has four main components: a name tagger, translation lists, a transliteration engine, and a context-based ranker. This chapter presents these components in detail and investigates the impact of name translation on cross-lingual spoken sentence retrieval.
    Improvements in Dynamic Programming Beam Search for Phrase-based Statistical Machine Translation
    Hermann Ney
    Proceedings of the International Workshop on Spoken Language Translation, Honolulu, HI (2008), pp. 195-205
    Search is a central component of any statistical machine translation system. We describe the search for phrase-based SMT in detail and show its importance for achieving good translation quality. We introduce an explicit distinction between reordering and lexical hypotheses and organize the pruning accordingly. We show that for the large Chinese-English NIST task a small number of lexical alternatives already suffices, whereas a large number of reordering hypotheses is required to achieve good translation quality. The resulting system compares favorably with the current state of the art; in particular, we perform a comparison with cube pruning as well as with Moses.
    Efficient Speech Translation through Confusion Network Decoding
    Nicola Bertoldi
    Marcello Federico
    Wade Shen
    IEEE Transactions on Audio, Speech and Language Processing, 16 (2008), pp. 1696-1705
    Chunk-level reordering of source language sentences with automatically learned rules for statistical machine translation
    Yuqi Zhang
    Hermann Ney
    Proceedings of the NAACL-HLT 2007/AMTA Workshop on Syntax and Structure in Statistical Translation, Association for Computational Linguistics, Rochester, New York, pp. 1-8
    In this paper, we describe a source-side reordering method based on syntactic chunks for phrase-based statistical machine translation. First, we shallow parse the source language sentences. Then, reordering rules are automatically learned from source-side chunks and word alignments. During translation, the rules are used to generate a reordering lattice for each sentence. Experimental results are reported for a Chinese-to-English task, showing an improvement of 0.5% - 1.8% BLEU score absolute on various test sets and better computational efficiency than reordering during decoding. The experiments also show that the reordering at the chunk-level performs better than at the POS-level.
    Efficient Phrase-table Representation for Machine Translation with Applications to Online MT and Speech Translation
    Hermann Ney
    Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), ACL, Rochester, NY (2007), pp. 492-499
    In phrase-based statistical machine translation, the phrase-table requires a large amount of memory. We will present an efficient representation with two key properties: on-demand loading and a prefix tree structure for the source phrases. We will show that this representation scales well to large data tasks and that we are able to store hundreds of millions of phrase pairs in the phrase-table. For the large Chinese–English NIST task, the memory requirements of the phrase-table are reduced to less than 20 MB using the new representation with no loss in translation quality and speed. Additionally, the new representation is not limited to a specific test set, which is important for online or real-time machine translation. One problem in speech translation is the matching of phrases in the input word graph and the phrase-table. We will describe a novel algorithm that effectively solves this combinatorial problem exploiting the prefix tree data structure of the phrase-table. This algorithm enables the use of significantly larger input word graphs in a more efficient way resulting in improved translation quality.
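    The prefix-tree idea in this abstract can be illustrated with a minimal sketch (the class and method names below are hypothetical, not taken from the paper's implementation): source phrases are stored token by token in a trie, so every phrase-table entry matching a prefix of the input is found in a single walk down the tree rather than one hash lookup per candidate phrase length.

    ```python
    class PhrasePrefixTree:
        """Minimal trie over source-phrase tokens; a node may hold translations."""

        def __init__(self):
            self.children = {}
            self.translations = []  # list of (target_phrase, score) pairs

        def insert(self, source_tokens, target_phrase, score):
            node = self
            for tok in source_tokens:
                node = node.children.setdefault(tok, PhrasePrefixTree())
            node.translations.append((target_phrase, score))

        def match_prefixes(self, tokens):
            """Yield (length, translations) for every stored source phrase
            that is a prefix of `tokens` -- one walk down the tree."""
            node, matched = self, 0
            for tok in tokens:
                node = node.children.get(tok)
                if node is None:
                    return
                matched += 1
                if node.translations:
                    yield matched, node.translations
    ```

    A single `match_prefixes` call over the input suffix starting at each position recovers all applicable phrase pairs, which is the property the paper exploits for on-demand loading and word-graph matching.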
    Minimum Bayes risk decoding for BLEU
    Nicola Ehling
    Hermann Ney
    Proceedings of the 45th Annual Meeting of the ACL (2007)
    We present a Minimum Bayes Risk (MBR) decoder for statistical machine translation. The approach aims to minimize the expected loss of translation errors with regard to the BLEU score. We show that MBR decoding on N-best lists leads to an improvement of translation quality. We report the performance of the MBR decoder on four different tasks: the TC-STAR EPPS Spanish-English task 2006, the NIST Chinese-English task 2005 and the GALE Arabic-English and Chinese-English task 2006. The absolute improvement of the BLEU score is between 0.2% for the TC-STAR task and 1.1% for the GALE Chinese-English task.
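    The core of MBR decoding on an N-best list is a small computation: pick the hypothesis whose expected gain under the model's posterior distribution is maximal. A minimal sketch, with the caveat that the gain function below is a simplified unigram-overlap stand-in for the sentence-level BLEU loss used in the paper:

    ```python
    from collections import Counter

    def mbr_decode(nbest):
        """Return the hypothesis with maximal expected gain over the N-best list.

        nbest: list of (hypothesis_tokens, posterior_probability) pairs.
        gain() is a normalized unigram-overlap stand-in for sentence BLEU.
        """
        def gain(hyp, ref):
            overlap = sum((Counter(hyp) & Counter(ref)).values())
            return overlap / max(len(hyp), 1)

        total = sum(p for _, p in nbest)
        best, best_gain = None, float("-inf")
        for hyp, _ in nbest:
            # Expected gain of `hyp`: each list entry acts as a pseudo-reference
            # weighted by its (renormalized) posterior probability.
            expected = sum(p / total * gain(hyp, ref) for ref, p in nbest)
            if expected > best_gain:
                best, best_gain = hyp, expected
        return best
    ```

    Note that the returned hypothesis can differ from the maximum a-posteriori choice: a translation that shares n-grams with many probable competitors may win even if it is not the single most probable entry.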
    Improved chunk-level reordering for statistical machine translation
    Yuqi Zhang
    Hermann Ney
    IWSLT (2007)
    Inspired by previous chunk-level reordering approaches to statistical machine translation, this paper presents two methods to improve reordering at the chunk level. By introducing a new lattice weighting factor and by reordering the training source data, improvements are reported on TER and BLEU. Compared to the previous chunk-level reordering approach, the BLEU score improves by 1.4% absolute. The translation results are reported on the IWSLT Chinese-English task.
    Moses: Open source toolkit for statistical machine translation
    Philipp Koehn
    Hieu Hoang
    Alexandra Birch
    Chris Callison-Burch
    Marcello Federico
    Nicola Bertoldi
    Brooke Cowan
    Wade Shen
    Christine Moran
    Chris Dyer
    Ondrej Bojar
    Alexandra Constantin
    Evan Herbst
    Proceedings of the 45th Annual Meeting of the ACL - Demo and Poster Sessions, Association for Computational Linguistics, Prague, Czech Republic (2007), pp. 177-180
    We describe an open-source toolkit for statistical machine translation whose novel contributions are (a) support for linguistically motivated factors, (b) confusion network decoding, and (c) efficient data formats for translation models and language models. In addition to the SMT decoder, the toolkit also includes a wide variety of tools for training, tuning, and applying the system to many translation tasks.
    Speech translation by confusion network decoding
    Nicola Bertoldi
    Marcello Federico
    2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, Honolulu, Hawaii, USA, pp. IV-1297-1300
    This paper describes advances in the use of confusion networks as an interface between automatic speech recognition and machine translation. In particular, it presents an implementation of a confusion network decoder which significantly improves on previous work along this direction in both efficiency and performance. The confusion network decoder is an extension of a state-of-the-art phrase-based text translation system. Experimental results in terms of decoding speed and translation accuracy are reported on a real-data task, namely the translation of plenary speeches at the European Parliament from Spanish to English.
    A Systematic Comparison of Training Criteria for Statistical Machine Translation
    Sasa Hasan
    Hermann Ney
    Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), ACL, Prague, Czech Republic (2007), pp. 524-532
    We address the problem of training the free parameters of a statistical machine translation system. We show significant improvements over a state-of-the-art minimum error rate training baseline on a large Chinese-English translation task. We present novel training criteria based on maximum likelihood estimation and expected loss computation. Additionally, we compare the maximum a-posteriori decision rule and the minimum Bayes risk decision rule. We show that, not only from a theoretical point of view but also in terms of translation quality, the minimum Bayes risk decision rule is preferable.
    Discriminative Reordering Models for Statistical Machine Translation
    Hermann Ney
    Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL): Proceedings of the Workshop on Statistical Machine Translation, ACL, New York City, NY (2006), pp. 55-63
    We present discriminative reordering models for phrase-based statistical machine translation. The models are trained using the maximum entropy principle. We use several types of features: based on words, based on word classes, based on the local context. We evaluate the overall performance of the reordering models as well as the contribution of the individual feature types on a word-aligned corpus. Additionally, we show improved translation performance using these reordering models compared to a state-of-the-art baseline system.
    The JHU workshop 2006 IWSLT system
    Wade Shen
    Nicola Bertoldi
    Marcello Federico
    IWSLT (2006)
    This paper describes the SMT system we built during the 2006 JHU Summer Workshop for the IWSLT 2006 evaluation. Our effort focuses on two parts of the speech translation problem: 1) efficient decoding of word lattices and 2) novel applications of factored translation models to IWSLT-specific problems. In this paper, we present results from the open-track Chinese-to-English condition. Improvements of 5-10% relative BLEU are obtained over a high-performing baseline. We introduce a new open-source decoder that implements the state of the art in statistical machine translation.
    The RWTH statistical machine translation system for the IWSLT 2006 evaluation
    Arne Mauser
    Evgeny Matusov
    Sasa Hasan
    Hermann Ney
    IWSLT (2006)
    We give an overview of the RWTH phrase-based statistical machine translation system that was used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2006. The system was ranked first with respect to the BLEU measure in all language pairs in which it was used.
    N-Gram Posterior Probabilities for Statistical Machine Translation
    Hermann Ney
    Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL): Proceedings of the Workshop on Statistical Machine Translation, ACL, New York City, NY (2006), pp. 72-77
    Word posterior probabilities are a common approach for confidence estimation in automatic speech recognition and machine translation. We will generalize this idea and introduce n-gram posterior probabilities and show how these can be used to improve translation quality. Additionally, we will introduce a sentence length model based on posterior probabilities. We will show significant improvements on the Chinese-English NIST task. The absolute improvement of the BLEU score is between 1.1% and 1.6%.
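    The generalization from word to n-gram posteriors can be sketched over an N-best list: an n-gram's posterior is the summed, renormalized probability of all hypotheses containing it. This is a simplified, position-independent variant for illustration, not the paper's exact formulation:

    ```python
    from collections import defaultdict

    def ngram_posteriors(nbest, n):
        """Posterior probability of each n-gram over an N-best list.

        nbest: list of (hypothesis_tokens, posterior_probability) pairs.
        Returns a dict mapping each n-gram (as a tuple) to the summed,
        renormalized probability of the hypotheses that contain it.
        """
        total = sum(p for _, p in nbest)
        post = defaultdict(float)
        for hyp, p in nbest:
            # Count each n-gram once per hypothesis, regardless of position.
            seen = {tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1)}
            for ng in seen:
                post[ng] += p / total
        return dict(post)
    ```

    An n-gram shared by most high-probability hypotheses gets a posterior near 1, which is what makes these scores usable as confidence features during rescoring.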
    Novel reordering approaches in phrase-based statistical machine translation
    Stephan Kanthak
    David Vilar
    Evgeny Matusov
    Hermann Ney
    Proceedings of the ACL Workshop on Building and Using Parallel Texts, Association for Computational Linguistics, Ann Arbor, Michigan, USA (2005), pp. 167-174
    This paper presents novel approaches to reordering in phrase-based statistical machine translation. We perform consistent reordering of source sentences in training and estimate a statistical translation model. Using this model, we follow a phrase-based monotonic machine translation approach, for which we develop an efficient and flexible reordering framework that allows different reordering constraints to be introduced easily. In translation, we apply source sentence reordering on the word level and use a reordering automaton as input. We show how to compute reordering automata on demand using IBM or ITG constraints, and also introduce two new types of reordering constraints. We further add weights to the reordering automata. We present detailed experimental results and show that reordering significantly improves translation quality.
    The RWTH Phrase-based Statistical Machine Translation System
    Oliver Bender
    Sasa Hasan
    Shahram Khadivi
    Evgeny Matusov
    Jia Xu
    Yuqi Zhang
    Hermann Ney
    IWSLT (2005)
    We give an overview of the RWTH phrase-based statistical machine translation system that was used in the evaluation campaign of the International Workshop on Spoken Language Translation 2005. We use a two-pass approach. In the first pass, we generate a list of the N best translation candidates. The second pass consists of rescoring and reranking this N-best list. We will give a description of the search algorithm as well as the models that are used in each pass. We participated in the supplied data tracks for manual transcriptions for the following translation directions: Arabic-English, Chinese-English, English-Chinese and Japanese-English. For Japanese-English, we also participated in the C-Star track. In addition, we performed translations of automatic speech recognition output for Chinese-English and Japanese-English. For both language pairs, we translated the single-best ASR hypotheses. Additionally, we translated Chinese ASR lattices.
    Word graphs for statistical machine translation
    Hermann Ney
    Proceedings of the ACL Workshop on Building and Using Parallel Texts (2005), pp. 191-198
    Word graphs have various applications in the field of machine translation. Therefore it is important for machine translation systems to produce compact word graphs of high quality. We will describe the generation of word graphs for state-of-the-art phrase-based statistical machine translation. We will use these word graphs to provide an analysis of the search process. We will evaluate the quality of the word graphs using the well-known graph word error rate. Additionally, we introduce two novel graph-to-string criteria: the position-independent graph word error rate and the graph BLEU score.
    Improvements in phrase-based statistical machine translation
    Hermann Ney
    Proceedings of HLT-NAACL, Association for Computational Linguistics, Boston, MA (2004), pp. 257-264
    In statistical machine translation, the currently best performing systems are based in some way on phrases or word groups. We describe the baseline phrase-based translation system and various refinements. We describe a highly efficient monotone search algorithm with a complexity linear in the input sentence length. We present translation results for three tasks: Verbmobil, Xerox and the Canadian Hansards. For the Xerox task, it takes less than 7 seconds to translate the whole test set consisting of more than 10K words. The translation results for the Xerox and Canadian Hansards task are very promising. The system even outperforms the alignment template system.
    Symmetric word alignments for statistical machine translation
    Evgeny Matusov
    Hermann Ney
    COLING '04 Proceedings of the 20th International Conference on Computational Linguistics, Association for Computational Linguistics, Geneva, Switzerland (2004)
    In this paper, we address the word alignment problem for statistical machine translation. We aim at creating a symmetric word alignment allowing for reliable one-to-many and many-to-one word relationships. We perform the iterative alignment training in the source-to-target and the target-to-source direction with the well-known IBM and HMM alignment models. Using these models, we robustly estimate the local costs of aligning a source word and a target word in each sentence pair. Then, we use efficient graph algorithms to determine the symmetric alignment with minimal total costs (i.e., maximal alignment probability). We evaluate the automatic alignments created in this way on the German-English Verbmobil task and the French-English Canadian Hansards task. We show statistically significant improvements of the alignment quality compared to the best results reported so far. On the Verbmobil task, we achieve an improvement of more than 1% absolute over the baseline error rate of 4.7%.
    Improved Word Alignment Using a Symmetric Lexicon Model
    Evgeny Matusov
    Hermann Ney
    Proceedings of the 20th International Conference on Computational Linguistics (Coling), Geneva, Switzerland (2004), pp. 36-42
    Word-aligned bilingual corpora are an important knowledge source for many tasks in natural language processing. We improve the well-known IBM alignment models, as well as the Hidden-Markov alignment model using a symmetric lexicon model. This symmetrization takes not only the standard translation direction from source to target into account, but also the inverse translation direction from target to source. We present a theoretically sound derivation of these techniques. In addition to the symmetrization, we introduce a smoothed lexicon model. The standard lexicon model is based on full-form words only. We propose a lexicon smoothing method that takes the word base forms explicitly into account. Therefore, it is especially useful for highly inflected languages such as German. We evaluate these methods on the German-English Verbmobil task and the French-English Canadian Hansards task. We show statistically significant improvements of the alignment quality compared to the best system reported so far. For the Canadian Hansards task, we achieve an improvement of more than 30% relative.
    Reordering Constraints for Phrase-Based Statistical Machine Translation
    Hermann Ney
    Taro Watanabe
    Eiichiro Sumita
    Proceedings of the 20th International Conference on Computational Linguistics (Coling), Geneva, Switzerland (2004), pp. 205-211
    In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible reorderings in an appropriate way, we obtain a polynomial-time search algorithm. We investigate different reordering constraints for phrase-based statistical machine translation, namely the IBM constraints and the ITG constraints. We present efficient dynamic programming algorithms for both constraints. We evaluate the constraints with respect to translation quality on two Japanese-English tasks. We show that the reordering constraints improve translation quality compared to an unconstrained search that permits arbitrary phrase reorderings. The ITG constraints perform best on both tasks and yield statistically significant improvements compared to the unconstrained search.
    Alignment templates: the RWTH SMT system
    Oliver Bender
    Evgeny Matusov
    Hermann Ney
    IWSLT (2004)
    In this paper, we describe the RWTH statistical machine translation (SMT) system which is based on log-linear model combination. All knowledge sources are treated as feature functions which depend on the source language sentence, the target language sentence and possible hidden variables. The main feature of our approach is the alignment templates, which take shallow phrase structures into account: a phrase-level alignment between phrases and a word-level alignment between single words within the phrases. Thereby, we directly consider word contexts and local reorderings. In order to incorporate additional models (the IBM-1 statistical lexicon model, a word deletion model, and higher order language models), we perform n-best list rescoring. Participating in the International Workshop on Spoken Language Translation (IWSLT 2004), we evaluate our system on the Basic Travel Expression Corpus (BTEC) Chinese-to-English and Japanese-to-English tasks.
    Efficient Search for Interactive Statistical Machine Translation
    Franz Josef Och
    Hermann Ney
    Proceedings of the Tenth Conference of the European Chapter of the Association for Computational Linguistics (EACL), Budapest, Hungary (2003), pp. 387-394
    The goal of interactive machine translation is to improve the productivity of human translators. An interactive machine translation system operates as follows: the automatic system proposes a translation. Now, the human user has two options: to accept the suggestion or to correct it. During the post-editing process, the human user is assisted by the interactive system in the following way: the system suggests an extension of the current translation prefix. Then, the user either accepts this extension (completely or partially) or ignores it. The two most important factors of such an interactive system are the quality of the proposed extensions and the response time. Here, we will use a fully fledged translation system to ensure the quality of the proposed extensions. To achieve fast response times, we will use word hypotheses graphs as an efficient search space representation. We will show results of our approach on the Verbmobil task and on the Canadian Hansards task.
    A Comparative Study on Reordering Constraints in Statistical Machine Translation
    Hermann Ney
    Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), ACL, Sapporo, Japan (2003), pp. 144-151
    In statistical machine translation, the generation of a translation hypothesis is computationally expensive. If arbitrary word reorderings are permitted, the search problem is NP-hard. On the other hand, if we restrict the possible word reorderings in an appropriate way, we obtain a polynomial-time search algorithm. In this paper, we compare two different reordering constraints, namely the ITG constraints and the IBM constraints. This comparison includes a theoretical discussion on the permitted number of reorderings for each of these constraints. We show a connection between the ITG constraints and the Schröder numbers, known since 1870. We evaluate these constraints on two tasks: the Verbmobil task and the Canadian Hansards task. The evaluation consists of two parts: First, we check how many of the Viterbi alignments of the training corpus satisfy each of these constraints. Second, we restrict the search to each of these constraints and compare the resulting translation hypotheses. The experiments will show that the baseline ITG constraints are not sufficient on the Canadian Hansards task. Therefore, we present an extension to the ITG constraints. These extended ITG constraints increase the alignment coverage from about 87% to 96%.
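    The Schröder-number connection mentioned in this abstract can be checked by brute force: the permutations reachable under ITG reordering are exactly the separable permutations, i.e. those avoiding the patterns 2413 and 3142, and their count is the large Schröder number sequence 1, 2, 6, 22, 90, ... A small illustrative script (not from the paper):

    ```python
    from itertools import combinations, permutations

    def contains_pattern(perm, pattern):
        """True if `perm` contains `pattern` as an order-isomorphic subsequence."""
        k = len(pattern)
        for idxs in combinations(range(len(perm)), k):
            sub = [perm[i] for i in idxs]
            ranks = sorted(sub)
            # Replace each value by its rank within the subsequence.
            if tuple(ranks.index(v) + 1 for v in sub) == tuple(pattern):
                return True
        return False

    def count_itg_reorderings(n):
        """Count permutations of length n reachable under ITG reordering:
        the separable permutations, which avoid both 2413 and 3142."""
        return sum(
            1 for p in permutations(range(1, n + 1))
            if not contains_pattern(p, (2, 4, 1, 3))
            and not contains_pattern(p, (3, 1, 4, 2))
        )
    ```

    For n = 4 exactly two of the 24 permutations, 2413 and 3142, are excluded, which is where the Schröder count 22 comes from; the IBM constraints admit a different, eventually smaller set of reorderings.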
    Phrase-Based Statistical Machine Translation
    Franz Josef Och
    Hermann Ney
    KI (2002), pp. 18-32