Uncovering Malware Families Using Convolutional Neural Networks (CNN)

Authors

  • Ruly Sumargo, Universitas Pradita, Tangerang
  • Handri Santoso, Universitas Pradita, Tangerang

DOI:

https://doi.org/10.24014/ijaidm.v7i1.27243

Keywords:

Convolutional Neural Network, Cyberattack, Email, Malware, Python

Abstract

Malware attacks pose a significant cyber threat, and vulnerability reports in security communities keep rising because malware authors continually introduce mutations to evade detection. One of the most attractive targets for malware is an organization's email system. Mutations within a malware family complicate the development of effective machine learning-based malware analysis and classification methods. To address this challenge, this research uses an agnostic deep learning solution, inspired by ImageNet's success, that efficiently classifies malware into families by analyzing visual representations of malicious software as greyscale images with a Convolutional Neural Network (CNN). Malwizard is a flexible Python tool, suitable for both organizations and end users, that enables automated and rapid malware analysis within an email system. Malwizard can be used as an Outlook email add-in and as an API service for SOAR platforms. The study evaluates this approach on the Microsoft Malware Classification Challenge dataset, where image representations are encrypted to address privacy concerns. Experimental results show that the proposed approach performs comparably to the best existing model on plaintext data while accomplishing the task in one-third of the time. For the encrypted dataset, adjustments to classical techniques are necessary for improved efficiency.
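
The abstract does not say how a binary is turned into a greyscale image. One common convention, assumed here, is to treat each byte as a single pixel intensity (0 to 255) and wrap the byte stream into rows of a fixed width. The sketch below illustrates only that preprocessing step. The function name `bytes_to_greyscale`, the width of 16, and the zero padding are illustrative assumptions, not details taken from Malwizard.

```python
from __future__ import annotations

import math


def bytes_to_greyscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Map each byte to one pixel intensity (0-255), filling rows left to right.

    Hypothetical helper: the width and zero padding are assumptions,
    not the paper's documented preprocessing.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Number of rows needed to hold every byte at the chosen width.
    height = math.ceil(len(data) / width)
    # Pad the last row with zero bytes (black pixels) so the image is rectangular.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Stand-in for the raw bytes of a sample file (not real malware).
sample = bytes(range(40))
image = bytes_to_greyscale(sample, width=16)
print(len(image), "rows x", len(image[0]), "columns")
```

The resulting grid of intensities could then be resized to a fixed shape and passed to a CNN classifier. That stage depends on the chosen deep learning framework and is left out here.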

Published

2023-12-25

Section

Articles