Statistical Machine Translation for Query Expansion in Answer Retrieval

Stefan Riezler
Alexander Vasserman
Vibhu Mittal
Yi Liu
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL'07), Prague, Czech Republic (2007)

Abstract

This paper presents a novel approach to query expansion in answer retrieval that uses Statistical Machine Translation (SMT) techniques to bridge the lexical gap between questions and answers. SMT-based query expansion is performed in two ways: by using an SMT-based full-sentence paraphraser to introduce synonyms in the context of the full query, and by training an SMT model on question-answer pairs and expanding queries with answer terms taken from translations of full queries. We compare these global, context-aware query expansion techniques with a baseline tf-idf model and local query expansion on a database of 10 million question-answer pairs extracted from FAQ pages. Experimental results show a significant improvement of SMT-based query expansion over both baselines.
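
The idea of expanding a query with answer-side terms can be illustrated with a minimal sketch. This is not the paper's system: the paper draws expansion terms from translations of full queries, whereas the sketch below assumes a hypothetical word-level translation table (`TOY_TABLE`) and ignores context entirely. The function name `expand_query`, the table contents, and the probabilities are all invented for illustration.

```python
from collections import Counter

# Hypothetical toy translation table (an assumption, not data from the paper):
# question term -> {candidate answer term: translation probability}.
TOY_TABLE = {
    "fix": {"repair": 0.4, "replace": 0.3, "fix": 0.3},
    "laptop": {"laptop": 0.5, "notebook": 0.35, "battery": 0.15},
}


def expand_query(query, table, top_k=2):
    """Append the top_k highest-scoring answer-side terms to the query.

    Simplification: each query word is looked up independently, so this
    is a context-free stand-in for full-sentence SMT-based expansion.
    """
    terms = query.lower().split()
    scores = Counter()
    for term in terms:
        for candidate, prob in table.get(term, {}).items():
            # Skip candidates that are already in the query.
            if candidate not in terms:
                scores[candidate] += prob
    expansions = [word for word, _ in scores.most_common(top_k)]
    return terms + expansions


if __name__ == "__main__":
    print(expand_query("fix laptop", TOY_TABLE))
```

With the toy table above, the query "fix laptop" gains the terms "repair" and "notebook". A retrieval system would then match the expanded term list against candidate answers.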