A Learning Model to Detect Maliciousness of Portable Executable Using Integrated Feature Set

Rout, Ajit Kumar; Kuppusamy, K. S.; Aghila, G.

Source:	Journal of King Saud University - Computer and Information Sciences
Publisher:	King Saud University
Main author:	Rout, Ajit Kumar (Author)
Other authors:	Kuppusamy, K. S. (Co-Author) , Aghila, G. (Co-Author)
Volume/Issue:	Vol. 31, No. 2
Peer-reviewed:	Yes
Country:	Saudi Arabia
Gregorian date:	2019
Pages:	254 - 265
DOI:	10.33948/0584-031-002-010
ISSN:	1319-1578
MD number:	974631
Content type:	Research and articles
Language:	English
Databases:	science
Subjects:	Computer Science \| Software \| Machine Learning \| Algorithms
Author keywords:	Malware \| Portable Executable \| Machine Learning \| Integrated Features

Abstract:

Malware is one of the foremost obstacles to the expansion and growth of digital adoption among users. Both enterprises and ordinary users struggle to protect themselves from malware in cyberspace, which underscores the importance of developing efficient malware detection methods. In this work, we propose a machine-learning-based solution that classifies a sample as benign or malware with high accuracy and low computational overhead. An integrated feature set is constructed by combining the raw values of portable executable (PE) header fields with derived values. Several machine learning algorithms, namely Decision Tree, Random Forest, kNN, Logistic Regression, Linear Discriminant Analysis, and Naive Bayes, were used to classify malware. We compared the performance of each classifier on the existing raw feature set and on the proposed integrated feature set. The empirical evidence indicates 98.4% classification accuracy in 10-fold cross-validation for the proposed integrated feature set. In experiments on a novel test data set, the accuracy was 89.23% for the integrated feature set, a 15% improvement over the accuracy achieved with the raw feature set alone. Classification accuracy using only the top N features (N = 5, 10, 15, 20, 25) was also evaluated; with only the top 15 features, 98% and 97% accuracy can be achieved on the integrated and raw feature sets, respectively. © 2017 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Similar items

Intrusion Detection Model Using Fusion Of Chi-Square Feature Selection And Multi Class SVM
By: Ikram, Sumaiya Thaseen Published: (2017)
Improving Package Structure Of Object Oriented Software Using Multi Objective Optimization And Weighted Class Connections
By: Amarjeet Published: (2017)
Detecting Redundant Test Cases Using Deep Learning
By: الدرابسة، رغد جودت Published: (2022)
A Heuristic Fault Based Optimization Approach To Reduce Test Vectors Count In VLSI Testing
By: Khera, Vinod Kumar Published: (2019)
INITIALIZING C-MEANS USING GENETIC ALGORITHMS
By: Al Shaboul, Bashar Awad Published: (2006)
