Semantic Model for Fast Tagging of Word Lattices
Abstract
This paper introduces a semantic tagger that inserts tags into a word lattice, such as one produced by a real-time large-vocabulary speech recognition system. Benefits of such a tagger the ability to rescore speech recognition hypotheses based on this metadata, as well as providing rich annotations to clients downstream. We focus on the domain of spoken search queries and voice commands, which can be useful for building an intelligent assistant. We explore a method to distill a preexisting very large semantic model into a lightweight tagger. This is accomplished by constructing a joint distribution of tagged n-grams from a supervised training corpus, then deriving a conditional distribution for a given lattice. With 300 classes, the tagger achieves a precision of 88.2% and recall of 93.1% on 1-best paths in speech recognition lattices with 2.8ms median latency.