The Power of Language Music: Arabic Lemmatization through Patterns
Abstract
Patterns play a pivotal role in Arabic morphological processing, in both derivation and inflection. Yet they have not been fully exploited in the computational processing of the language. The novel contribution of this paper is to perform lemmatization, a high-level lexical processing task, without relying on a lookup dictionary. We use a machine learning classifier to predict the lemma pattern for a given stem, and then apply mapping rules to convert stems to their respective lemmas.
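To make the two-step idea concrete, the following minimal Python sketch predicts a lemma pattern for a stem and then applies a mapping rule that fills that pattern with the stem's root consonants. Everything in it is an illustrative assumption rather than the paper's implementation: the digit-slot pattern notation, the informal Latin transliteration, the toy training pairs, and the names `match_pattern`, `apply_pattern`, `ToyPatternClassifier`, and `lemmatize` are all hypothetical. In particular, a simple majority vote stands in for the machine learning classifier.

```python
from collections import Counter, defaultdict

# Patterns use the digits 1, 2, 3 for root-consonant slots; every other
# character is a literal. This notation and the transliterated toy data
# below are illustrative assumptions, not taken from the paper.

def match_pattern(stem, pattern):
    """Return {slot: consonant} if stem fits pattern, otherwise None."""
    if len(stem) != len(pattern):
        return None
    root = {}
    for ch, slot in zip(stem, pattern):
        if slot.isdigit():
            # A repeated slot must be filled by the same consonant.
            if root.setdefault(slot, ch) != ch:
                return None
        elif ch != slot:
            return None
    return root

def apply_pattern(root, pattern):
    """Mapping rule: fill the lemma pattern's slots with the root consonants."""
    return "".join(root.get(slot, slot) for slot in pattern)

class ToyPatternClassifier:
    """Majority-vote stand-in for a trained classifier (hypothetical)."""

    def __init__(self):
        self.votes = defaultdict(Counter)

    def fit(self, pairs):
        # Count how often each stem pattern co-occurs with each lemma pattern.
        for stem_pattern, lemma_pattern in pairs:
            self.votes[stem_pattern][lemma_pattern] += 1
        return self

    def predict(self, stem):
        # Use the first known stem pattern that the stem fits.
        for stem_pattern, counts in self.votes.items():
            root = match_pattern(stem, stem_pattern)
            if root is not None:
                return root, counts.most_common(1)[0][0]
        return None, None

def lemmatize(stem, clf):
    """Predict the lemma pattern, then map the stem onto it."""
    root, lemma_pattern = clf.predict(stem)
    # Fall back to the stem itself when no pattern fits.
    return stem if root is None else apply_pattern(root, lemma_pattern)

# Toy (stem pattern, lemma pattern) pairs; no word list is stored.
training = [
    ("1u2u3", "1i2A3"),   # e.g. kutub -> kitAb
    ("1u2u3", "1i2A3"),
    ("1u2u3", "1a2iy3"),  # e.g. Turuq -> Tariyq
    ("a12A3", "1a2a3"),   # e.g. aqlAm -> qalam
]
clf = ToyPatternClassifier().fit(training)
print(lemmatize("kutub", clf))
print(lemmatize("aqlAm", clf))
```

The sketch stores only pattern pairs, never word forms, which is the sense in which lemmatization proceeds without a lookup dictionary. The competing lemma patterns for `1u2u3` hint at why a learned classifier with richer features would be needed in practice: the stem's surface pattern alone does not always determine the lemma pattern.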