Unsupervised Discovery and Training of Maximally Dissimilar Cluster Models

Francoise Beaufays; Vincent Vanhoucke; Brian Strope

Proc Interspeech (2010)

Abstract

One of the difficult problems of acoustic modeling for Automatic
Speech Recognition (ASR) is how to adequately model
the wide variety of acoustic conditions which may be present
in the data. The problem is especially acute for tasks such as
Google Search by Voice, where the amount of speech available
per transaction is small, and adaptation techniques start showing
their limitations. However, because training data from a very large
user population is available, it is possible to identify and
jointly model subsets of the data with similar acoustic qualities.
We describe a technique which allows us to perform this
modeling at scale on large amounts of data by learning a tree-structured
partition of the acoustic space, and we demonstrate
that we can significantly improve recognition accuracy in various
conditions through unsupervised Maximum Mutual Information
(MMI) training. Being fully unsupervised, this technique
scales easily to increasing numbers of conditions.
