- Ciprian Chelba
- Thorsten Brants
- Will Neveitt
- Peng Xu
The paper presents an in-depth analysis of a lesser-known interaction between Kneser-Ney smoothing and entropy pruning that leads to severe degradation in language model performance under aggressive pruning regimes. Experiments in a data-rich setup such as google.com voice search show a significant impact on WER as well: pruning Kneser-Ney and Katz models to 0.1% of their original size degrades speech recognition accuracy significantly, by approximately 10% relative.
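A distinctive property of Kneser-Ney smoothing, relevant to this interaction, is that its lower-order distribution is built from continuation counts (how many distinct contexts a word follows) rather than raw counts, so it is not a standalone word distribution. A minimal sketch of the textbook interpolated bigram form (illustrative only; not the paper's n-gram implementation, and the function names and fixed discount are assumptions):

```python
from collections import Counter

def kneser_ney_bigram(tokens, discount=0.75):
    """Interpolated Kneser-Ney bigram estimator (textbook form).

    Returns prob(v, w) = P_KN(w | v), combining a discounted bigram
    estimate with a continuation-count lower-order distribution.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    bigram_counts = Counter(bigrams)
    context_counts = Counter(tokens[:-1])
    # Continuation counts: number of distinct left contexts per word.
    continuation = Counter(w for (_, w) in set(bigrams))
    total_bigram_types = len(set(bigrams))

    def prob(v, w):
        # Absolute-discounted bigram estimate.
        disc = max(bigram_counts[(v, w)] - discount, 0.0) / context_counts[v]
        # Back-off weight: probability mass freed by discounting,
        # proportional to the number of distinct words following v.
        types_after_v = sum(1 for (a, _) in bigram_counts if a == v)
        lam = discount * types_after_v / context_counts[v]
        # Interpolate with the continuation (lower-order) distribution.
        return disc + lam * continuation[w] / total_bigram_types

    return prob
```

Because the lower-order distribution here estimates continuation likelihood rather than word frequency, an entropy-pruning criterion that treats back-off probabilities as ordinary estimates can be misled, which is the interaction the paper analyzes.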
Any third party with an LDC membership should be able to reproduce our experiments using the scripts available at http://code.google.com/p/kneser-ney-pruning-experiments.