Unsupervised Context Learning For Speech Recognition

Assaf Michaely; Justin Scheiner; Mohammadreza Ghodsi; Petar Aleksic; Zelin Wu

Spoken Language Technology (SLT) Workshop, IEEE (2016)

Abstract

It has been shown in the literature that automatic speech recognition systems can benefit greatly from contextual information [ref]. Contextual information can be used to simplify the search and improve recognition accuracy. Useful types of contextual information include the name of the application the user is in, the contents of the user’s phone screen, the user’s location, a particular dialog state, etc. Building a separate language model for each of these types of context is not feasible because of limited resources or a limited amount of training data.

In this paper we describe an approach for unsupervised learning of contextual information and automatic building of contextual (biasing) models. Our approach can be used to build a large number of small contextual models from a limited amount of available unsupervised training data. We describe how n-grams relevant to a particular context are automatically selected, as well as how the optimal size of the final contextual model is chosen. Our experimental results show large accuracy improvements for several types of context.
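
The abstract does not say how relevant n-grams are scored, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It ranks n-grams by the log ratio of their smoothed relative frequency in context-specific text to their smoothed relative frequency in background text. The function names, the add-one smoothing, the `min_count` filter, and the hand-set `top_k` (a stand-in for the automatically chosen model size the paper describes) are all hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of order n as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(sentences, max_order=2):
    """Count n-grams of orders 1..max_order over whitespace-tokenized text."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for n in range(1, max_order + 1):
            counts.update(ngrams(tokens, n))
    return counts


def select_context_ngrams(context_sents, background_sents,
                          top_k=3, max_order=2, min_count=2):
    """Pick the top_k n-grams most over-represented in the context data.

    Assumption: relevance = log(p_context / p_background) with add-one
    smoothing. top_k is set by hand here; it is not the paper's
    size-selection procedure.
    """
    ctx = count_ngrams(context_sents, max_order)
    bg = count_ngrams(background_sents, max_order)
    ctx_total = sum(ctx.values())
    bg_total = sum(bg.values())
    vocab = len(set(ctx) | set(bg))

    scored = []
    for gram, count in ctx.items():
        if count < min_count:
            continue  # skip n-grams too rare to trust
        p_ctx = (count + 1) / (ctx_total + vocab)
        p_bg = (bg[gram] + 1) / (bg_total + vocab)
        scored.append((math.log(p_ctx / p_bg), gram))

    # Highest score first; ties broken by the n-gram itself for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [gram for _, gram in scored[:top_k]]


if __name__ == "__main__":
    context = ["call mom", "call dad", "call mom now"]
    background = ["play music", "what is the weather",
                  "call a taxi", "play the news"]
    print(select_context_ngrams(context, background, top_k=2))
```

In this toy setup, "call" also appears in the background text, so it scores lower than "mom" and "call mom", which occur only in the context data. A real biasing model would additionally need weights for the selected n-grams, which this sketch does not attempt.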
