Acoustic Sensitive Language Model Perplexity for Automatic Speech Recognition

Ciprian Chelba

Proceedings of Machine Learning Workshop, Snowbird, UT (2006)

Abstract

Traditional evaluation of language models (LM) for automatic speech recognition (ASR) uses either the information-theoretically motivated perplexity (PPL) or the word error rate (WER), which is measured by plugging the model into a speech recognizer.
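For reference, perplexity is the exponential of the average negative log-probability that the LM assigns to each word of a test text. The following is a minimal sketch of that standard definition; the function name and the example probabilities are hypothetical and do not come from the paper.

```python
import math


def perplexity(word_log_probs):
    """Perplexity from per-word natural-log probabilities log P(w_i | history)."""
    if not word_log_probs:
        raise ValueError("need at least one word")
    # Average negative log-probability per word (cross-entropy in nats).
    cross_entropy = -sum(word_log_probs) / len(word_log_probs)
    return math.exp(cross_entropy)


# Hypothetical probabilities an LM assigns to the four words of a sentence.
probs = [0.25, 0.25, 0.25, 0.25]
ppl = perplexity([math.log(p) for p in probs])  # approximately 4.0
```

Note that only the correct text enters this computation; no acoustic information is involved.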

It is well known that WER and PPL are poorly correlated. The main reason is probably that PPL measures the predictive power of the LM on correct text, whereas at recognition time the LM needs to discriminate between alternatives suggested by the acoustic model used in the recognizer. Since the LM is estimated using maximum-likelihood methods on correct (well-formed) sentences, it is poorly suited for discriminating among the likely candidates proposed by the acoustic model.

We propose a new evaluation metric for LMs that takes into account the coupling between language model and acoustic model in a given ASR system. The new metric, “acoustic-model-sensitive” perplexity (AMS-PPL), aims to allow one to optimize the LM parameters so that the LM performs best when used with a given acoustic model. The main idea is to estimate the conditional cross-entropy H(W|A) of the correct word sequence W given the acoustic signal A being decoded.
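One way to make the idea of a conditional cross-entropy concrete is sketched below. The abstract does not say how H(W|A) is estimated, so everything here is an assumption for illustration: the posterior P(W|A) is normalized over a finite candidate list (for example an N-best list), the LM score is scaled by a hypothetical lm_weight, and the result is normalized per word and exponentiated by analogy with ordinary perplexity. All function and parameter names are hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def conditional_log_posterior(hyps, correct_index, lm_weight=1.0):
    """log P(W|A) of the correct hypothesis, normalized over the candidates.

    hyps: list of (acoustic_log_likelihood, lm_log_prob) pairs, one per
    candidate word sequence (assumption: the list contains the correct one).
    """
    scores = [am + lm_weight * lm for am, lm in hyps]
    return scores[correct_index] - log_sum_exp(scores)


def ams_perplexity(utterances, lm_weight=1.0):
    """Per-word exponentiated conditional cross-entropy (illustrative only).

    utterances: list of (hyps, correct_index, num_words) triples.
    """
    total_log_post = 0.0
    total_words = 0
    for hyps, correct_index, num_words in utterances:
        total_log_post += conditional_log_posterior(hyps, correct_index, lm_weight)
        total_words += num_words
    return math.exp(-total_log_post / total_words)
```

Under these assumptions, the score depends on how well the LM separates the correct word sequence from its acoustically confusable competitors, rather than on the probability of the correct text alone.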

Research Areas

Natural language processing
