المستخلص: |
This paper presents a method for automatically annotating Arabic news stories with tags using Wikipedia. The idea of the system is to use Wikipedia article names, properties, and redirects to build a pool of meaningful tags. Efficient matching methods are then used to detect text fragments in input news stories that correspond to entries in the constructed tag pool. The generated tags represent real-world entities or concepts, such as the names of popular places, well-known organizations, and celebrities. These tags can be used indirectly by a news site for indexing, clustering, classification, and statistics generation, or directly to give a news reader an overview of a story's contents. An evaluation of the system showed that the tags it generates are better than those generated by MSN Arabic news in terms of precision, recall, and accuracy.
|