- Yoni Halpern
- Keith Hall
- Vlad Schogol
- Michael Riley
- Brian Roark
- Gleb Skobeltsyn
- Martin Baeuml
Abstract
We introduce an approach to biasing language models towards known contexts without requiring separate language models or explicit contextually-dependent conditioning contexts. We do so by presenting an alternative ASR objective, in which we predict the acoustics and words given the contextual cue, such as the geographic location of the speaker. A simple factoring of the model results in an additional biasing term, which effectively indicates how correlated a hypothesis is with the contextual cue (e.g., given the hypothesized transcript, how likely is the user’s known location). We demonstrate that this factorization allows us to train relatively small contextual models which are effective in speech recognition. An experimental analysis shows both a perplexity reduction and a significant word error rate reduction on a voice search task when using the user’s location as a contextual cue.
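To make the factoring concrete, the following is a minimal sketch of the kind of decomposition the abstract describes, assuming the contextual cue $C$ (e.g., the speaker's location) is conditionally independent of the acoustics $A$ given the word sequence $W$; the exact form used in the paper may differ.

```latex
\begin{align*}
W^{*} &= \operatorname*{arg\,max}_{W} \; P(W, A \mid C) \\
      &= \operatorname*{arg\,max}_{W} \; P(A \mid W)\, P(W \mid C) \\
      &= \operatorname*{arg\,max}_{W} \;
         \underbrace{P(A \mid W)}_{\text{acoustic model}} \,
         \underbrace{P(W)}_{\text{language model}} \,
         \underbrace{P(C \mid W)}_{\text{biasing term}}
\end{align*}
```

The last step applies Bayes' rule and drops $P(C)$, which is constant across hypotheses. The biasing term $P(C \mid W)$ scores how likely the known context is given the hypothesized transcript, matching the parenthetical example above, and can be modeled separately from the base language model.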