Effectively Building Tera Scale MaxEnt Language Models Incorporating Non-Linguistic Signals

Fadi Biadsy; Mohammadreza Ghodsi; Diamantino Caseiro

Interspeech 2017

Abstract

Maximum Entropy (MaxEnt) Language Models (LMs) are powerful models
that can incorporate linguistic and non-linguistic contextual signals
in a unified framework, by optimizing a convex loss function.
In addition to their flexibility, a key advantage is their scalability,
in terms of model size and the amount of data that can be used during
training. We present the following two contributions to
MaxEnt training: (1) By leveraging smaller amounts of transcribed
data, we demonstrate that a MaxEnt LM trained on various
types of corpora can be easily adapted to better match the test
distribution of speech recognition; (2) we introduce a novel adaptive-training approach that efficiently
models multiple types of non-linguistic features in a
universal model.
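
As a rough illustration of how a MaxEnt LM can combine linguistic and non-linguistic signals under one convex loss, the sketch below trains a tiny log-linear model with plain SGD. It is not the authors' system: the feature templates (unigram, bigram, and a state-word "geo" feature), the three-word vocabulary, the two training examples, and the names `features`, `probs`, and `sgd_step` are all hypothetical, and neither the adaptation step nor the adaptive-training approach from the abstract is shown.

```python
import math
from collections import defaultdict

def features(word, prev_word, state):
    # Hypothetical feature templates: two linguistic (unigram, bigram)
    # and one non-linguistic (the speaker's US state paired with the word).
    return [("uni", word), ("bi", prev_word, word), ("geo", state, word)]

def probs(weights, vocab, prev_word, state):
    # Log-linear model: P(w | context) is proportional to
    # exp(sum of the weights of the active features).
    scores = [sum(weights[f] for f in features(w, prev_word, state)) for w in vocab]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {w: e / z for w, e in zip(vocab, exps)}

def sgd_step(weights, vocab, prev_word, state, target, lr=0.5):
    # One SGD step on the negative log-likelihood, which is convex in the weights.
    p = probs(weights, vocab, prev_word, state)
    for w in vocab:
        grad = p[w] - (1.0 if w == target else 0.0)
        for f in features(w, prev_word, state):
            weights[f] -= lr * grad
    return -math.log(p[target])  # loss measured before the update

# Toy data (assumed for illustration): same linguistic context, different states.
vocab = ["austin", "boston", "weather"]
data = [("in", "TX", "austin"), ("in", "MA", "boston")]

weights = defaultdict(float)
losses = []
for epoch in range(50):
    losses.append(sum(sgd_step(weights, vocab, prev, state, tgt)
                      for prev, state, tgt in data))

p_tx = probs(weights, vocab, "in", "TX")
p_ma = probs(weights, vocab, "in", "MA")
```

Because both examples share the word history "in", only the state feature can separate them, so after training the model should prefer "austin" when the state is TX and "boston" when it is MA.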

We test the impact of these approaches on Google's state-of-the-art
speech recognizer for the task of voice-search transcription and
dictation. Training 10B-parameter models utilizing a corpus
of up to 1T words, we show large reductions in word error
rate from adaptation across multiple languages. Also, human evaluations
show strong, significant improvements on a wide range of domains from
using non-linguistic signals. For example, adapting to geographical
domains (e.g., US states and cities) affects about 4% of test
utterances, with a 2:1 win-to-loss ratio.

Research Areas

Machine intelligence
