PDF Web Documents Categorization Using Association Rules Mining

عبود، فاضل حنون

Source:	Iraqi Journal of Information Technology
Publisher:	Iraqi Association of Information Technology
Main author:	عبود، فاضل حنون (Author)
Main author (English):	Abbood, Fadhil Hannoon
Volume/Issue:	Vol. 6, No. 4
Peer-reviewed:	Yes
Country:	Iraq
Date (Gregorian):	2014
Pages:	125 - 139
DOI:	10.34279/0923-006-004-004
ISSN:	1994-8638
MD number:	707879
Content type:	Research and articles
Language:	English
Databases:	HumanIndex
Subjects:	Classification; Web documents

Number of downloads

30

LEADER	03799nam a22002297a 4500
001	0101022
024		\|3 10.34279/0923-006-004-004
041		\|a eng
044		\|b Iraq
100		\|9 369511 \|a عبود، فاضل حنون \|g Abbood, Fadhil Hannoon \|e Author
245		\|a PDF Web Documents Categorization Using Association Rules Mining
260		\|b Iraqi Association of Information Technology \|c 2014
300		\|a 125 - 139
336		\|a Research and articles
520		\|a Association rule mining was used to extract features and classification rules from a set of previously prepared documents whose classes are known. To achieve the aims of this research in categorizing web documents, the problem was treated as four main tasks: text extraction, preprocessing and representation of the documents, construction of the classifier, and finally evaluation of the classifier. A number of PDF (Portable Document Format) files were collected and analyzed to discover essential and important features. The analysis showed that some surface features can strongly affect and improve the categorization process. Therefore, they were repeated a certain number of times within the texts. To increase the accuracy of the data, a method for interchangeable words that share a single meaning was introduced. A list of unnecessary words was compiled so that they could be removed. Since many English words are known to carry a suffix, an algorithm was developed to handle this. Rules that did not satisfy certain conditions were pruned, and the remaining rules were used in the categorization process. Measures were used to assess the classifier, showing that it reached a very high accuracy of 97% and an error rate of 3%. \|b Document categorization aims to map text documents into one or more predefined classes based on their contents. This problem has recently attracted many scholars in the web mining and machine learning communities, since the online documents that hold useful information for decision makers are numerous. This paper investigates a method of classifying PDF Web documents using association rule mining. A number of PDF documents are collected and analyzed to detect vital and essential features. Rank values are suggested for these features. A Mutual Meaning Unify (MMU) technique is proposed to increase the accuracy of document representations. To reduce the document vector space, stop words are removed. To reduce the number of document terms, a stemming algorithm is used. Because a large number of rules is generated, a pruning process is proposed to keep only the highly distinguishing rules. The resulting rules, which make up the classifier, are used for the categorization process. As a result, the classifier is accurate and operates well: it has an accuracy of about 97% and an error rate of 3%.
653		\|a Classification
653		\|a Web documents
773		\|4 Information Science and Library Science \|6 Information Science & Library Science \|c 004 \|e Iraqi Journal of Information Technology \|f Al-Maǧallaẗ al-ʻirāqiyyaẗ li-tiknulūǧiyā al-maʻlūmāt \|l 004 \|m Vol. 6, No. 4 \|o 0923 \|s المجلة العراقية لتكنولوجيا المعلومات \|v 006 \|x 1994-8638
856		\|u 0923-006-004-004.pdf
930		\|d y \|p y \|q y
995		\|a HumanIndex
999		\|c 707879 \|d 707879
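
The pipeline summarized in the abstract (stop-word removal, stemming, rule generation, pruning, rule-based categorization) can be illustrated with a minimal sketch. This is not the paper's implementation: the stop-word list, the suffix-stripping `stem` function, the support and confidence thresholds, and all function names are assumptions made for illustration, and the feature rank values and the MMU technique are not modeled here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stop-word list; the paper's actual list is not given in the record.
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to", "for"}

def stem(word):
    # Crude suffix stripping, standing in for the paper's stemming algorithm.
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    # Lowercase, drop stop words, stem, and keep the set of distinct terms.
    return frozenset(stem(w) for w in text.lower().split() if w not in STOP_WORDS)

def mine_rules(docs, min_support=0.5, min_confidence=1.0, max_len=2):
    # docs: list of (text, label). Builds rules "termset -> label" and prunes
    # those below the (assumed) support and confidence thresholds.
    n = len(docs)
    itemset_count = Counter()
    rule_count = Counter()
    for text, label in docs:
        terms = sorted(preprocess(text))
        for k in range(1, max_len + 1):
            for combo in combinations(terms, k):
                itemset_count[combo] += 1
                rule_count[(combo, label)] += 1
    rules = []
    for (combo, label), hits in rule_count.items():
        support = hits / n
        confidence = hits / itemset_count[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((frozenset(combo), label, confidence))
    # Prefer more confident, then more specific (longer) rules.
    rules.sort(key=lambda r: (-r[2], -len(r[0])))
    return rules

def classify(rules, text, default=None):
    # Return the label of the first rule whose termset occurs in the document.
    terms = preprocess(text)
    for termset, label, _ in rules:
        if termset <= terms:
            return label
    return default

def accuracy(rules, docs):
    # Fraction of documents labeled correctly; the error rate is 1 - accuracy.
    correct = sum(1 for text, label in docs if classify(rules, text) == label)
    return correct / len(docs)

# Toy training set with made-up texts and labels.
train = [
    ("mining association rules in databases", "data-mining"),
    ("association rules mining for text", "data-mining"),
    ("training neural networks", "learning"),
    ("deep neural networks training", "learning"),
]
rules = mine_rules(train)
print(classify(rules, "pruning association rules"))
print(accuracy(rules, train))
```

Accuracy and error rate are complementary in this sketch, which matches the 97% and 3% figures reported in the abstract.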

Similar items

A Framework for Ranking and Categorizing Medical Documents
By: Al Zamil, Mohammed Gh. I. Published: (2010)
Investigation of associative classification techniques for text categorization
By: Al Mukhtar, Abd Allah Mohammed A. Published: (2011)
An efficient associative classification algorithm for text categorization
By: Abu Rumman, Bashar Suleiman Abd Allah Published: (2012)
INVESTIGATING THE USE OF FILTERING TECHNIQUES TO IMPROVE THE ACCURACY OF DISCOVERED ASSOCIATION RULES IN NOISY DOMAINS
By: Darwish, Zainab Ali Published: (2014)
A collaborative courses recommendation system using mining association rules
By: السكران، جمال محمد Published: (2006)
