Bio-Inspired Feature Extraction and Ensemble Approach to Classify Malware

Gaurav Mehta, Pradeepta Kumar Sarangi, Shaily Jain, Vikas Tripathi

Citation:

Gaurav Mehta, Pradeepta Kumar Sarangi, Shaily Jain, Vikas Tripathi, "Bio-Inspired Feature Extraction and Ensemble Approach to Classify Malware," International Journal of Electronics and Communication Engineering, vol. 12, no. 5, pp. 379-390, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I5P131

Abstract

Malware classification plays an important role in preventing security threats. Because malware affects many platforms, including mobile devices, computers, and the IoT, a classification approach must remain effective even on large data samples. This paper presents a detailed analysis of an ensemble architecture, highlighting the synergistic effect of combining Convolutional Neural Networks (CNNs) with Ant Colony Optimization (ACO) and an Artificial Neural Network (ANN) for robust feature extraction and selection, with the aim of providing a scalable solution for real-time malware detection. We propose a comprehensive model that integrates a CNN, enhanced with Rectified Linear Unit (ReLU) activation, for feature extraction with ACO for feature selection. To evaluate the proposed model, we used the Microsoft BIG15 malware dataset, which contains 9 classes, and obtained an accuracy of 98.76%, surpassing traditional and standalone methods.
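
The abstract does not spell out how ACO performs feature selection, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simplified binary ant-colony-style search in which each feature carries a pheromone value that acts as its inclusion probability. The names `fitness` and `aco_select`, the parameters `n_ants`, `n_iters`, and `rho`, and the nearest-centroid scoring (swapped in here for the ANN classifier) are all hypothetical. The matrix `X` is synthetic data standing in for CNN-extracted features.

```python
import numpy as np

def fitness(X, y, mask):
    """Score a feature subset: nearest-centroid training accuracy
    minus a small penalty for keeping many features (assumed scoring)."""
    if not mask.any():
        return 0.0
    Xs = X[:, mask]
    classes = np.unique(y)
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in classes])
    # Squared distance from every sample to every class centroid.
    dists = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    acc = (classes[dists.argmin(axis=1)] == y).mean()
    return acc - 0.01 * mask.mean()

def aco_select(X, y, n_ants=20, n_iters=30, rho=0.2, seed=0):
    """Simplified binary ant-colony-style feature selection (illustrative)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    tau = np.full(n_features, 0.5)  # pheromone = inclusion probability
    best_mask = np.ones(n_features, dtype=bool)
    best_fit = -np.inf
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant samples a feature subset guided by the pheromone.
            mask = rng.random(n_features) < tau
            f = fitness(X, y, mask)
            if f > best_fit:
                best_mask, best_fit = mask, f
        # Evaporate pheromone, then deposit on the best subset found so far.
        tau = (1 - rho) * tau + rho * best_mask
        # Bound the pheromone so no feature is permanently fixed in or out.
        tau = np.clip(tau, 0.05, 0.95)
    return best_mask, best_fit

# Synthetic stand-in for extracted features: 10 columns, 2 informative.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 10))
X[:, :2] += 6.0 * y[:, None]  # only the first two columns carry class signal

mask, score = aco_select(X, y)
print(mask.nonzero()[0], round(score, 3))
```

In a pipeline like the one the abstract outlines, the selected columns `X[:, mask]` would then be passed to the downstream classifier. The size penalty in `fitness` is one possible way to favor compact subsets and is an assumption of this sketch.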

Keywords

Malware classification, Ant Colony Optimization Algorithms, Convolution Neural Network, Transfer Learning, Long Short-Term Memory.
