Using syntactic and confusion network structure for out-of-vocabulary word detection
Abstract
This paper addresses the problem of detecting words that are out-of-vocabulary (OOV) for a speech recognition system to improve automatic speech translation. The detection system leverages confidence prediction techniques given a confusion network representation and parsing with OOV word tokens to
identify spans associated with true OOV words. Working in a resource-constrained domain, we achieve OOV detection F-scores of 60-66 and reduce word error rate by 12% re