Using syntactic and confusion network structure for out-of-vocabulary word detection

Alex Marin
Mari Ostendorf
Luke Zettlemoyer
Spoken Language Technology Workshop (SLT), 2012 IEEE

Abstract

This paper addresses the problem of detecting words that are out-of-vocabulary (OOV) for a speech recognition system, with the goal of improving automatic speech translation. The detection system leverages confidence prediction techniques given a confusion network representation and parsing with OOV word tokens to identify spans associated with true OOV words. Working in a resource-constrained domain, we achieve OOV detection F-scores of 60-66 and reduce word error rate by 12% re
