Advances in Arabic Broadcast News Transcription at RWTH

Stefan Hahn
Christian Gollan
Ralf Schlüter
Hermann Ney
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) (2007), pp. 449-454


This paper describes the RWTH speech recognition system for Arabic. We analyze several design aspects of the system, including cross-adaptation, multiple-system design, and system combination. We summarize the semi-automatic lexicon generation for Arabic, which uses a statistical approach to grapheme-to-phoneme conversion together with pronunciation statistics. Furthermore, we present a novel ASR-based audio segmentation algorithm. Finally, we discuss practical approaches to parallelized acoustic training and memory-efficient lattice rescoring. Systematic results are reported on recent GALE evaluation corpora.
