Abstract: |
This paper presents a hybrid method for tagging Arabic words that combines a rule-based approach with machine learning. An Arabic word may be composed of a stem plus affixes and clitics: affixes include inflectional markers for tense, gender, and number, while clitics include some prepositions, conjunctions, and others. A small number of rules dominate the performance. Tagging is cast as a classification task carried out by memory-based learning. The proposed method is based on rules that consider the post-position, the ending of a word, and patterns. For each rule, a number of exceptional cases are stored in a library. During classification, a word is presented to memory-based reasoning, its similarity to all examples in memory is computed using a similarity metric, and the tag is determined again. By checking the exceptional cases of the rules, more information is made available to the learner for treating those cases. To evaluate the proposed method and to demonstrate the importance of the various kinds of information in learning, a number of experiments were run.
|