Cybercrime and Authorship Detection in Very Short Texts: A Quantitative Morpho-Lexical Approach

Source: Academic Research Journal for Arts (مجلة البحث العلمي في الآداب)
Publisher: Ain Shams University - Faculty of Women for Arts, Science and Education
Main author: Omar, Abdulfattah (Author)
Volume/Issue: Issue 20, Part 1
Peer-reviewed: Yes
Country: Egypt
Gregorian date: 2019
Pages: 291 - 316
DOI: 10.21608/JSSA.2019.38725
ISSN: 2356-8321
MD number: 978060
Content type: Research and articles
Language: English
Databases: AraBase
Subjects: Cybercrime | Property Rights | Computational Linguistics | Self-Organizing Maps
Author keywords:
Authorship Detection | Forensic Linguistics | Morphological Patterns | Lexical Features | Letter Pair Frequencies | Self-Organizing Maps (SOMs)
LEADER 02754nam a22002297a 4500
001 1720501
024 |3 10.21608/JSSA.2019.38725 
041 |a eng 
044 |b Egypt 
100 |9 521057  |a Omar, Abdulfattah  |e Author 
245 |a Cybercrime and Authorship Detection in Very Short Texts:  |b A Quantitative Morpho-Lexical Approach 
260 |b Ain Shams University - Faculty of Women for Arts, Science and Education  |c 2019 
300 |a 291 - 316 
336 |a Research and articles  |b Article 
520 |b This study proposes an integrated framework that combines letter-pair frequencies and combinations with the lexical features of documents. Drawing on a quantitative morpho-lexical approach, it tests the hypothesis that letter information, or letter mapping, carries unique stylistic features, so that detecting stable word combinations and morphological patterns can improve authorship-detection performance on very short texts. The data analyzed are a corpus of 12,240 tweets drawn from 87 Twitter accounts. A self-organizing map (SOM) model is used to group input patterns that share common features, treating shared class membership as a clue that the tweets were written by the same author. Results indicate that the classification accuracy of the proposed system is around 76%. Up to 22% of this accuracy was lost, however, when only distinctive words were used, and 26% was lost when classification was based on letter combinations and morphological patterns alone. Integrating letter pairs and morphological patterns into the system improved the accuracy of determining the author of a given tweet. This indicates that combining different linguistic variables in a single system leads to better classification performance on very short texts. The self-organizing map (SOM) also led to better clustering performance because of its capacity to integrate two different linguistic levels of each author profile. 
653 |a Cybercrime  |a Property Rights  |a Computational Linguistics  |a Self-Organizing Maps 
692 |b Authorship Detection  |b Forensic Linguistics  |b Morphological Patterns  |b Lexical Features  |b Letter Pair Frequencies  |b Self-Organizing Maps (SOMs) 
773 |4 الادب  |6 Literature  |c 010  |e Academic Research Journal for Arts  |f Mağallaẗ Al-Baḥṯ Al-ʿilmī Fī Al-Ādāb  |l 001  |m Issue 20, Part 1  |o 0795  |s مجلة البحث العلمي في الآداب  |v 020  |x 2356-8321 
856 |u 0795-020-001-010.pdf  |n https://jssa.journals.ekb.eg/article_38725.html 
930 |d y  |p y  |q y 
995 |a AraBase 
999 |c 978060  |d 978060 
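
The pipeline outlined in the abstract (letter-pair frequencies combined with lexical features, then grouped with a self-organizing map) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names, the 2x2 grid, the learning-rate schedule, and the sample texts are all hypothetical, and morphological patterns are approximated here by within-word letter pairs only.

```python
import math
import random
from collections import Counter


def letter_pairs(text):
    """Count adjacent letter pairs inside each word (case-insensitive)."""
    pairs = Counter()
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        pairs.update(a + b for a, b in zip(letters, letters[1:]))
    return pairs


def word_counts(text):
    """Count words after stripping non-letter characters."""
    tokens = ("".join(ch for ch in w if ch.isalpha()) for w in text.lower().split())
    return Counter(t for t in tokens if t)


def build_vectors(texts):
    """One vector per text: relative letter-pair frequencies + relative word frequencies."""
    pcs = [letter_pairs(t) for t in texts]
    wcs = [word_counts(t) for t in texts]
    pair_vocab = sorted(set().union(*pcs))
    word_vocab = sorted(set().union(*wcs))
    vectors = []
    for pc, wc in zip(pcs, wcs):
        p_total = sum(pc.values()) or 1
        w_total = sum(wc.values()) or 1
        vectors.append([pc[p] / p_total for p in pair_vocab]
                       + [wc[w] / w_total for w in word_vocab])
    return vectors, pair_vocab, word_vocab


def best_matching_unit(weights, vector):
    """Grid node whose weight vector is closest (squared Euclidean) to the input."""
    return min(weights,
               key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], vector)))


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; grid size and schedule are illustrative choices."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = {node: [rng.random() * 0.1 for _ in range(dim)] for node in grid}
    radius0 = max(rows, cols) / 2
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                      # linearly decaying learning rate
        radius = max(radius0 * (1 - frac), 0.5)    # shrinking neighbourhood radius
        for v in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(weights, v)
            for node in grid:
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))  # Gaussian neighbourhood weight
                w = weights[node]
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


# Hypothetical sample texts, standing in for tweets.
sample_texts = ["gonna grab coffee now", "gonna grab lunch now",
                "the meeting is postponed", "the meeting is cancelled"]
vectors, _, _ = build_vectors(sample_texts)
som = train_som(vectors)
assignments = [best_matching_unit(som, v) for v in vectors]
```

Texts mapped to the same grid node in `assignments` are candidates for shared authorship. Dropping either half of the concatenated vector in `build_vectors` gives a words-only or letter-pairs-only variant, which is one way to set up the kind of comparison the abstract reports.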
