Abstract: |
In this work, we propose an approach to Arabic optical text recognition based on linguistic concepts of Arabic vocabulary. We categorize the words of a text as decomposable (derived from a root) or indecomposable (not derived from a root), and put forward morpho-syntactic characterization hypotheses for each word. For decomposable words, we try to recognize the basic morphemes of the word: prefix, infix, suffix, and root. This contrasts with existing approaches, which usually recognize the whole word (holistic approach), the pseudo-word (pseudo-analytical approach), or the individual letter (analytical approach). Indecomposable words are recognized by the analytical approach.
|