
Google's Cross-Dialect Arabic Voice Search

Martin Jansche
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2012), pp. 4441-4444


We present a large-scale effort to build a commercial Automatic Speech Recognition (ASR) product for Arabic. Our goal is to support voice search, dictation, and voice control for the general Arabic-speaking public, including support for multiple Arabic dialects. We describe our ASR system design and compare recognizers for five Arabic dialects, with the potential to reach more than 125 million people in Egypt, Jordan, Lebanon, Saudi Arabia, and the United Arab Emirates (UAE). We compare systems built on diacritized vs. non-diacritized text. We also conduct cross-dialect experiments, in which we train on one dialect and test on the others. Our average word error rate (WER) is 24.8% for voice search.
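
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only; the abstract does not say which scoring tool or text normalization was used, and the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly;
    no normalization (e.g. diacritic removal) is applied here.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER computed this way can exceed 100% when the hypothesis is much longer than the reference.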
