LEADER |
02191nam a22002657a 4500 |
001 |
1716269 |
024 |
|
|
|3 10.33948/0584-026-004-009
|
041 |
|
|
|a eng
|
044 |
|
|
|b السعودية
|
100 |
|
|
|9 524810
|a El-Taher, Ahmed I.
|e Author
|
245 |
|
|
|a An Arabic CCG Approach For Determining Constituent Types From Arabic Treebank
|
260 |
|
|
|b جامعة الملك سعود
|c 2014
|
300 |
|
|
|a 441 - 449
|
336 |
|
|
|a Research and Articles
|b Article
|
520 |
|
|
|b Converting a treebank into a CCGbank opens the respective language to the sophisticated tools developed for Combinatory Categorial Grammar (CCG) and enriches cross-linguistic development. The conversion is primarily a three-step process: determining constituent types, binarization, and category conversion. This process usually involves a preprocessing step on the chosen treebank that corrects brackets, normalizes tags to account for changes introduced during manual annotation, and extracts the morpho-syntactic information needed to determine constituent types. In this article, we describe the required preprocessing step for the Arabic Treebank and how to determine Arabic constituent types. We conducted an experiment on parts 1 and 2 of the Penn Arabic Treebank (PATB), aimed at converting the PATB into an Arabic CCGbank. When applied to ATB1v2.0 and ATB2v2.0, our algorithm achieved 99% identification of head nodes and 100% coverage of the treebank data.
|
653 |
|
|
|a Computational Linguistics
|a Neuro-Linguistic Programming
|a Arabic Language
|
692 |
|
|
|b Arabic
|b CCGbank
|b Treebank
|
700 |
|
|
|9 524811
|a Abo Bakr, Hitham M.
|e Co-Author
|
700 |
|
|
|9 52388
|a Zidan, Ibrahim
|e Co-Author
|
700 |
|
|
|9 524812
|a Shaalan, Khaled
|e Co-Author
|
773 |
|
|
|c 009
|e Journal of King Saud University (Computer and Information Sciences)
|f Maǧalaẗ ǧamʼaẗ al-malīk Saud : ùlm al-ḥasib wa al-maʼlumat
|l 004
|m Vol. 26, No. 4
|o 0584
|s مجلة جامعة الملك سعود - علوم الحاسب والمعلومات
|v 026
|x 1319-1578
|
856 |
|
|
|u 0584-026-004-009.pdf
|
930 |
|
|
|d y
|p y
|
995 |
|
|
|a science
|
999 |
|
|
|c 973372
|d 973372
|