Richard Zens
Authored Publications
Content Explorer: Recommending Novel Entities for a Document Writer
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Background research is an inseparable part of document writing. Search engines are great for retrieving information once we know what to look for. However, the bigger challenge is often identifying topics for further research.
Automated tools could help significantly in this discovery process and increase the productivity of the writer.
In this paper, we introduce the problem of recommending topics to a writer.
We cast it as a supervised learning problem and run a user study to validate the approach.
We propose an evaluation metric and perform an empirical comparison of state-of-the-art models for extreme multi-label classification on a large data set.
We demonstrate how a simple modification of the cross-entropy loss function improves the results of the deep learning models.
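The abstract does not spell out the loss modification, so the paper's exact formulation is not reproduced here. Purely as a hypothetical sketch of the general idea, the snippet below adapts per-label binary cross-entropy for an extreme multi-label setting by up-weighting the sparse positive labels; the pos_weight knob and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def weighted_multilabel_bce(logits, targets, pos_weight=10.0):
    """Binary cross-entropy over a large label space.

    In extreme multi-label classification almost all targets are 0,
    so the plain loss is dominated by negatives; up-weighting the
    rare positives (pos_weight, a hypothetical knob) is one common
    fix, not necessarily the paper's.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # per-label sigmoid
    eps = 1e-12                            # numerical safety
    pos_term = pos_weight * targets * np.log(probs + eps)
    neg_term = (1.0 - targets) * np.log(1.0 - probs + eps)
    return -np.mean(pos_term + neg_term)

# Toy example: one positive among six labels.
logits = np.array([2.0, -1.0, -3.0, -2.5, -4.0, -1.5])
targets = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
print(weighted_multilabel_bce(logits, targets))
```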
A Systematic Comparison of Phrase Table Pruning Techniques
Peng Xu
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Association for Computational Linguistics, Jeju Island, Korea, pp. 972-983
When trained on very large parallel corpora, the phrase table component of a machine translation system grows to consume vast computational resources. In this paper, we introduce a novel pruning criterion that places phrase table pruning on a sound theoretical foundation. Systematic experiments on four language pairs under various data conditions show that our principled approach is superior to existing ad hoc pruning methods.
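For context, here is a minimal sketch of what ad hoc phrase table pruning looks like in practice. This illustrates a simple baseline of the kind the paper compares against (keep the top-k target phrases per source phrase by translation probability), not the paper's proposed criterion; all names and the top_k value are assumptions.

```python
from collections import defaultdict

def prune_phrase_table(entries, top_k=20):
    """Baseline threshold pruning: keep the top_k target phrases per
    source phrase, ranked by translation probability p(target|source).

    `entries` is an iterable of (source, target, prob) tuples; the
    representation and the top_k default are illustrative only.
    """
    by_source = defaultdict(list)
    for source, target, prob in entries:
        by_source[source].append((target, prob))
    pruned = []
    for source, candidates in by_source.items():
        candidates.sort(key=lambda tp: tp[1], reverse=True)
        for target, prob in candidates[:top_k]:
            pruned.append((source, target, prob))
    return pruned

# Toy table with three translations of one source phrase.
table = [("das haus", "the house", 0.7),
         ("das haus", "the home", 0.2),
         ("das haus", "house", 0.1)]
print(prune_phrase_table(table, top_k=2))
```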
Fast and Scalable Decoding with Language Model Look-Ahead for Phrase-based Statistical Machine Translation
Joern Wuebker
Hermann Ney
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Jeju, Republic of Korea (2012), pp. 28-32
In this work we present two extensions to the well-known dynamic programming beam search in phrase-based statistical machine translation (SMT), aiming at increased efficiency of decoding by minimizing the number of language model computations and hypothesis expansions. Our results show that language model based pre-sorting yields a small improvement in translation quality and a speedup by a factor of 2. Two look-ahead methods are shown to further increase translation speed by a factor of 2 without changing the search space and a factor of 4 with the side-effect of some additional search errors. We compare our approach with Moses and observe the same performance, but a substantially better trade-off between translation quality and speed. At a speed of roughly 70 words per second, Moses reaches 17.2% BLEU, whereas our approach yields 20.0% with identical models.
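A minimal sketch of the pre-sorting idea, assuming a unigram stand-in for the cheap language model estimate: translation options for a source span are ranked by phrase score plus the LM estimate before search, so beam expansion considers the most promising candidates first and wastes fewer LM computations on hypotheses that will be pruned. Function names and data structures here are illustrative assumptions, not the paper's implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Option:
    words: tuple         # target phrase
    phrase_score: float  # log-probability from the phrase table

def unigram_lm_estimate(words, logprob):
    """Cheap, context-free LM look-ahead: sum of unigram log-probs.
    A real decoder would use a proper n-gram model; this stand-in
    just illustrates the idea."""
    return sum(logprob.get(w, -10.0) for w in words)

def presort_options(options, logprob):
    """Pre-sort translation options by phrase score plus LM estimate,
    so beam expansion can visit the strongest candidates first and
    stop early once the beam is full."""
    return sorted(
        options,
        key=lambda o: o.phrase_score + unigram_lm_estimate(o.words, logprob),
        reverse=True,
    )

# Toy example: two options for one source span.
lm = {"the": -1.0, "house": -3.0, "home": -4.0}
opts = [Option(("the", "home"), -0.5), Option(("the", "house"), -0.7)]
for o in presort_options(opts, lm):
    print(o.words, o.phrase_score)
```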