RWTH OCR: A Large Vocabulary Optical Character Recognition System for Arabic Scripts

Philippe Dreuw
Hermann Ney
Guide to OCR for Arabic Scripts, Springer (2012), pp. 215-254

Abstract

We present a novel large-vocabulary OCR system that implements a confidence- and margin-based discriminative training approach for adapting the models of an HMM-based recognition system to multiple fonts, different handwriting styles, and their variations. Most current HMM approaches are HTK-based systems that are trained with maximum likelihood (ML) and that try to adapt their models to different writing styles using writer-adaptive training, unsupervised clustering, or additional writer-specific data. Here, discriminative training based on the Maximum Mutual Information (MMI) and Minimum Phone Error (MPE) criteria is used instead. For model adaptation during decoding, we propose unsupervised confidence-based discriminative training within a two-pass decoding process. Additionally, we use neural-network-based features extracted by a hierarchical multi-layer perceptron (MLP) network, either in a hybrid MLP/HMM approach or to discriminatively retrain a Gaussian HMM system in a tandem approach. The proposed framework and methods are evaluated for closed-vocabulary isolated handwritten word recognition on the IfN/ENIT Arabic handwriting database, where the word error rate is reduced by more than 50% relative to an ML-trained baseline system. Preliminary results for large-vocabulary recognition of machine-printed Arabic text are presented on a novel, publicly available newspaper database.
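
A note on reading the "more than 50% relative" figure: a relative reduction measures the drop in word error rate as a fraction of the baseline rate, not as a difference in percentage points. The following is a minimal Python sketch of that arithmetic. The function name and both error rates are assumptions chosen for illustration; they are not results reported in this work.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the reduction in word error rate as a fraction of the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only, not taken from the paper.
baseline = 0.20  # assumed ML-trained baseline: 20% of words misrecognized
adapted = 0.09   # assumed rate after discriminative training: 9%

reduction = relative_wer_reduction(baseline, adapted)
print(f"relative reduction: {reduction:.1%}")
```

With these assumed rates the absolute improvement is 11 percentage points, while the relative reduction exceeds one half of the baseline error.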

Research Areas