Towards A Standard Part Of Speech Tagset For The Arabic Language

Source: Journal of King Saud University - Computer and Information Sciences
Publisher: King Saud University
Main author: Zeroual, Imad (Author)
Other authors: Lakhouaja, Abdelhak (Co-Author), Belahbib, Rachid (Co-Author)
Volume/Issue: Vol. 29, No. 2
Peer-reviewed: Yes
Country: Saudi Arabia
Date (Gregorian): 2017
Pages: 171 - 178
DOI: 10.33948/0584-029-002-005
ISSN: 1319-1578
MD number: 974093
Content type: Research and articles
Language: English
Databases: science
Author keywords:
Natural Language Processing | Part Of Speech | Tagging | Arabic Tagset | Tree Tagger
Abstract: Part of Speech (PoS) tagging is still not very well investigated with respect to the Arabic language. Determining the PoS tag of a word in a particular context is difficult, primarily because most contemporary texts do not use diacritics. Consequently, the same word may be spelled in different ways. Further, detecting the difference between Arabic derivatives is a very challenging issue for the majority of PoS taggers. Hence, assigning the correct PoS tags requires advanced processing and considerable resources. This study aims to design detailed hierarchical levels of the Arabic tagset categories and their relationships. These hierarchical levels allow easier expansion when required and produce more accurate and precise results. They are based on a comparative study and important references in Arabic grammar; they are also validated by experts in this field. In addition, the proposed tagset is implemented in a PoS tagger and tested via various experiments. We believe that our study makes a significant contribution to the literature because this work is an advancement in the direction of achieving a standard, rich, and comprehensive tagset for Arabic.
